Synthetic candidate profiles and fake faces
As AI-powered hiring tools become increasingly prevalent, concerns about fairness and bias in automated decision-making grow. Large Language Models (LLMs) now play a critical role in evaluating candidates, but how fairly do they rank applicants across different demographic groups?
Our study evaluates 27 frontier LLMs – including models from OpenAI, Google, Meta, Anthropic, and others – to measure potential bias in AI-driven recruitment. Using a dataset of 1,000 synthetic candidate profiles with diverse demographic and professional attributes, we tested how these models score applicants and rank them in direct comparisons.
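The evaluation harness itself is not reproduced here, but a minimal sketch of the two ranking modes we describe (pointwise scoring and pairwise comparison) could look like the following. It assumes the OpenAI Python client as a stand-in for whichever model API is under test; the prompts, the `score_candidate`/`compare_candidates` names, and the choice of gpt-4o are illustrative assumptions, not the study's actual setup.

```python
# Sketch of the two evaluation modes: pointwise scoring and pairwise
# comparison. Prompts and function names are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o",  # any chat model under test
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return resp.choices[0].message.content.strip()

def score_candidate(profile_text: str) -> int:
    """Pointwise mode: ask for a 0-100 suitability score."""
    prompt = (
        "You are screening applicants for a software engineer role.\n"
        f"Candidate profile:\n{profile_text}\n"
        "Reply with a single integer suitability score from 0 to 100."
    )
    return int(ask(prompt))

def compare_candidates(profile_a: str, profile_b: str) -> str:
    """Pairwise mode: ask which of two candidates ranks higher."""
    prompt = (
        "You are screening applicants for a software engineer role.\n"
        f"Candidate A:\n{profile_a}\n\nCandidate B:\n{profile_b}\n"
        "Reply with exactly 'A' or 'B' for the stronger candidate."
    )
    return ask(prompt)
```

In practice the model's reply needs defensive parsing, and candidates should be swapped between the A and B positions across repeated queries so that order effects do not masquerade as bias.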
Key findings:
- Professional attributes dominate LLM decisions: 76-80% of professional features showed a statistically significant influence on rankings.
- Subtle but measurable demographic biases persist: on average, 8-9% of demographic features reached statistical significance across models (see the significance-testing sketch after this list).
- Bias varies by model and ranking method: some LLMs display stronger demographic correlations in scoring-based evaluations, while others show bias in comparative rankings.
- We introduce a “bias map” comparing how different LLMs balance professional vs. demographic influence, helping users choose fairer models for AI-driven HR.
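As a concrete illustration of how per-feature significance shares like the ones above can be computed, here is a sketch assuming the profiles and LLM scores live in a pandas DataFrame. The file name, column names, and 0.05 threshold are hypothetical, and statsmodels' OLS regression stands in for whichever test the study actually applied.

```python
# Sketch: regress LLM scores on one-hot encoded candidate features, then
# count how many demographic vs. professional features are significant.
# Column names and the 0.05 threshold are illustrative assumptions.
import pandas as pd
import statsmodels.api as sm

df = pd.read_csv("llm_scores.csv")  # hypothetical file: one row per profile

demographic = ["gender", "age_group", "ethnicity"]          # assumed columns
professional = ["degree", "years_experience", "last_role"]  # assumed columns

X = pd.get_dummies(df[demographic + professional], drop_first=True)
X = sm.add_constant(X.astype(float))
model = sm.OLS(df["llm_score"].astype(float), X).fit()

def significant_share(prefixes: list[str]) -> float:
    """Fraction of features in a group with p < 0.05."""
    cols = [c for c in X.columns if any(c.startswith(p) for p in prefixes)]
    hits = sum(model.pvalues[c] < 0.05 for c in cols)
    return hits / len(cols)

print("demographic features significant:", significant_share(demographic))
print("professional features significant:", significant_share(professional))
```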
Our findings highlight the need for ongoing evaluation and mitigation strategies to ensure AI-driven hiring supports fair and inclusive recruitment. Explore the full bias map and model rankings in our interactive results.
Methodology
Dataset URL: https://www.kaggle.com/datasets/guardeec/mkphoto2023
This study employed a multi-stage approach to identify and analyse bots using different types of profile photos, including AI-generated faces (from GAN-, transformer-, and diffusion-based models), stolen images, and anonymous pictures. First, bot account identifiers were collected from a large-scale dataset of the VK social network and linked to their profile images. Then, three neural network pipelines were used to classify each image: (1) YOLO to detect whether a person is present, (2) a face recognition pipeline (including celebrity detection) to check whether a discernible face is present and whether it belongs to a well-known individual, and (3) an AI-specific detector to recognise artificially generated faces.
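The exact pipeline is not released alongside the dataset; a compressed sketch of the cascade, assuming the ultralytics YOLO package and the face_recognition library, with the AI-face detector left as a hypothetical stub, could look like this:

```python
# Sketch of the three-stage photo classifier: (1) person detection,
# (2) face recognition against a celebrity gallery, (3) AI-face detection.
# detect_ai_face and the celebrity gallery are hypothetical stand-ins.
import face_recognition
from ultralytics import YOLO

person_detector = YOLO("yolov8n.pt")  # COCO-pretrained; class 0 = person
celebrity_encodings = []              # assumed: precomputed face encodings

def detect_ai_face(image) -> bool:
    """Hypothetical stand-in for the GAN/transformer/diffusion detector."""
    raise NotImplementedError

def classify_photo(path: str) -> str:
    # Stage 1: is there a person in the picture at all?
    boxes = person_detector(path)[0].boxes
    if not any(int(c) == 0 for c in boxes.cls):
        return "anonymous"  # no person detected

    image = face_recognition.load_image_file(path)
    encodings = face_recognition.face_encodings(image)

    # Stage 2: discernible face? known celebrity?
    if not encodings:
        return "person_no_face"
    if any(face_recognition.compare_faces(celebrity_encodings, encodings[0])):
        return "stolen_celebrity"

    # Stage 3: artificially generated face?
    if detect_ai_face(image):
        return "ai_generated"
    return "real_or_unknown"
```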


Next, the bot accounts were grouped according to their classified photo type and matched with key metrics such as price per malicious action, speed of execution, quality rating, likelihood of surviving platform moderation, and ability to deceive human evaluators. Statistical tests and distribution analyses were used to compare how bots with GAN-generated images differ from other bot categories across these metrics.
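The write-up names statistical tests without specifying them; one plausible sketch, assuming a pandas table of per-bot metrics with hypothetical column names, uses SciPy's Mann-Whitney U test, a reasonable choice for skewed market data such as prices:

```python
# Sketch: compare GAN-image bots against all other bot categories on each
# market metric. The bots.csv schema is an illustrative assumption; the
# Mann-Whitney U test is one defensible choice, not necessarily the study's.
import pandas as pd
from scipy.stats import mannwhitneyu

bots = pd.read_csv("bots.csv")  # hypothetical columns: photo_type, price, ...

metrics = ["price", "speed", "quality", "survival_rate", "human_deception"]
gan = bots[bots["photo_type"] == "gan_face"]
rest = bots[bots["photo_type"] != "gan_face"]

for metric in metrics:
    stat, p = mannwhitneyu(gan[metric].dropna(), rest[metric].dropna(),
                           alternative="two-sided")
    print(f"{metric}: U={stat:.0f}, p={p:.4f}")
```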

A connectivity graph was also constructed to reveal how multiple bot traders share or independently manage these AI-generated accounts.
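A graph like this can be assembled directly from trader-to-account listings. Below is a sketch with networkx, assuming a hypothetical listings table mapping each trader to the bot accounts they sell; connected components containing more than one trader indicate shared stock.

```python
# Sketch: bipartite trader-account graph. Accounts listed by several traders
# end up in shared connected components; singleton-trader components are
# independently managed stock. The listings.csv schema is an assumption.
import networkx as nx
import pandas as pd

listings = pd.read_csv("listings.csv")  # hypothetical: trader_id, account_id

G = nx.Graph()
for row in listings.itertuples(index=False):
    G.add_node(f"t{row.trader_id}", kind="trader")
    G.add_node(f"a{row.account_id}", kind="account")
    G.add_edge(f"t{row.trader_id}", f"a{row.account_id}")

for comp in nx.connected_components(G):
    traders = {n for n in comp if G.nodes[n]["kind"] == "trader"}
    if len(traders) > 1:
        print(f"{len(traders)} traders share {len(comp) - len(traders)} accounts")
```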

Future research
While our study provides a “bias map” of LLM-driven hiring, further research is needed to enhance fairness and accountability in AI recruitment. Key next steps include:
- Expanding the Bias Map – evaluating new models, including multimodal AI that assesses cover letters, CVs, and video interviews together, to understand broader hiring biases.
- Intersectional Analysis – investigating how multiple demographic attributes interact, since bias may not be uniform across groups.
- Regulatory Integration – aligning our bias detection framework with AI compliance standards (e.g., the EU AI Act, GDPR) to support safer AI deployment in HR.