Who Gets to Be What?
A Gender, Race, and Age Analysis of AI-Generated Images
Elena Pjetergjokaj, Sadia Zabin, Samir Mazumder
City College of New York
21007 Writing for Engineering
Professor India Choquette
May 14, 2025
Abstract
This lab report investigates the presence of age, race, and gender bias in AI-generated images, focusing on three distinct professions: pharmacist, babysitter, and mechanical engineer. Research has shown that AI image generators do not produce neutral representations but instead overrepresent certain groups while actively underrepresenting others. These patterns may reflect underlying societal stereotypes that are embedded in the data used to train these models.
Using Rabbithole, we generated 100 images of each profession and categorized them based on age, race, and gender. Our goal was not to measure accuracy against real-world demographic data, but rather to identify recurring patterns and visual stereotypes that suggest built-in bias in how AI “imagines” different roles. Our findings showed that these systems frequently reinforce narrow, often outdated, portrayals of people in professional settings, supporting the idea that image generators are shaped by the biases in their training data.
Introduction
Artificial intelligence image generators are becoming increasingly powerful tools capable of producing realistic visuals from simple prompts like “pharmacist,” “babysitter,” or “mechanical engineer.” However, the more widely these tools are adopted, the more pressing it becomes to address the biases present in their outputs. Bias in AI-generated images can manifest in several ways, including underrepresentation of certain groups or stereotypical portrayals based on race, gender, or age. These outcomes are rooted in the training data itself—data that often lacks diversity or reflects existing social inequalities. As a result, AI systems may unintentionally reinforce stereotypes even when marketed as neutral or objective.
Yiran Yang (2025), writing for AI & Society, explores racial bias in AI-generated images and provides compelling visual examples of how image generation systems often reinforce narrow portrayals of cultural and racial identity. Yang critiques how AI tools disproportionately favor White or East Asian features, marginalizing others. Her perspective is especially relevant to our lab, as she investigates how AI reflects not only technical limitations, but also the cultural frameworks embedded in the data. Like Yang, we analyze AI-generated portraits for patterns of bias in how professions are visualized and explore how the outputs align with deeply rooted social narratives.
While Yang focuses on creative image generation, we also drew from research in more technical, high-stakes domains. A peer-reviewed article by Yetisgen-Yildiz and Yetisgen (2024), published in Diagnostic and Interventional Radiology, examines how AI is used in medical imaging and shows that bias in training data can lead to less accurate diagnostic results for underrepresented groups. This underscores a broader truth: whether in healthcare or image generation, the data used to train AI determines who gets represented—and how. Their work reinforces the importance of carefully selecting diverse and inclusive datasets to ensure fairness and accuracy in AI outputs.
Lastly, a 2024 study in the Journal of Family Medicine and Primary Care analyzed AI-generated images of surgeons and found a significant underrepresentation of women and Black individuals when prompted with titles like “microsurgeon” or “plastic surgeon.” Although the study focuses on one AI system, it reveals how even single prompts can expose structural patterns of underrepresentation. These findings align with our project’s purpose: to show that biases in AI outputs are not random—they reflect systematic trends in how professions are imagined by the models generating them.
Together, these sources form a solid foundation for our lab report. By analyzing the visual representation of race, gender, and age in AI-generated images, our experiment highlights how bias emerges through recurring patterns and narrow portrayals. Rather than measuring statistical accuracy, we focus on the ways AI reflects and perpetuates social stereotypes, making this a crucial issue for developers, researchers, and everyday users alike.
Hypothesis
AI image generators do not produce neutral or diverse portrayals. Instead, they overrepresent certain groups while underrepresenting others, reinforcing visual stereotypes based on race, age, and gender.
Materials and Methods
Materials:
- AI image generator: Rabbithole
- Prompt list: “Pharmacist,” “Babysitter,” “Mechanical Engineer”
- Spreadsheet or data collection software
- Data visualization tools (e.g., Google Sheets, Excel, Canva)
Methodology:
- We selected three professions associated with different gender, age, and racial stereotypes.
- For each profession, we prompted the AI generator using only the job title (e.g., “pharmacist”), repeating the prompt until we had gathered a sample of 100 images for that profession.
- Each group member analyzed the images based on three criteria: race, gender, and age group (child, young adult, middle-aged, senior).
- We recorded the perceived demographics in a spreadsheet.
- We also documented example images that clearly represented bias or over/underrepresentation.
- Finally, we used graphs and charts to visualize trends and identify recurring patterns of over- and underrepresentation (a minimal tallying sketch follows this list).
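To illustrate the recording and tallying step, the following is a minimal Python sketch. It assumes a hypothetical spreadsheet export named pharmacist_images.csv with columns such as profession, gender, race, and age_group; the file name and column names are placeholders rather than our exact log format.

import csv
from collections import Counter

def tally(csv_path, column):
    # Count how often each perceived category appears in one column of the image log.
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    return Counter(row[column] for row in rows)

# Example: summarize perceived gender for the pharmacist images.
gender_counts = tally("pharmacist_images.csv", "gender")
total = sum(gender_counts.values())
for category, count in gender_counts.most_common():
    print(f"{category}: {count} images ({100 * count / total:.0f}%)")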
Results
For each profession (pharmacist, babysitter, and mechanical engineer), we generated 100 images using the AI platform Rabbithole and categorized them by perceived age, gender, race, and whether the image appeared real or animated. Instead of comparing these results to real-world demographics, we focused on what the AI prioritized or omitted. The goal was to identify recurring patterns and stereotypes that suggest built-in bias.
Pharmacist
The pharmacist images revealed a clear gender and racial bias. Eighty-two percent of the figures were male, and most appeared to be between 30 and 50 years old. White and East Asian features were overrepresented, while darker skin tones and women were noticeably underrepresented. This suggests the AI draws from a limited mental model of what a “pharmacist” looks like—favoring serious, older male professionals in medical-style clothing.
Babysitter
The AI overwhelmingly associated babysitters with young, light-skinned women. Over 90% of the images were female, and almost all were aged 20–30. The few male-presenting figures were blurry, distorted, or cartoonish. Additionally, more than 80% of the images were animated. Interestingly, most images showed messy rooms with toys and clutter—even though the prompt did not mention environment—implying the AI has internalized a stereotype that caregiving is chaotic and feminine.
Mechanical Engineer
Mechanical engineer outputs skewed heavily toward white, male figures aged 30–50. Around 70% of the images were male, and over 60% depicted white individuals. There was limited representation of women or racial diversity, and about 20–25% of the images were not people at all, but robots or abstract, mechanical forms. This suggests the AI struggles to break from stereotypical associations between masculinity, machinery, and engineering roles.
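For illustration, the sketch below shows how tallies like these could be turned into one of the comparison charts described in this section. The percentages simply restate the approximate male-presenting shares reported above, not a new measurement, and the output file name is arbitrary.

import matplotlib.pyplot as plt

professions = ["Pharmacist", "Babysitter", "Mechanical engineer"]
# Approximate male-presenting share of the 100 images per profession, as reported above.
percent_male = [82, 10, 70]

fig, ax = plt.subplots()
ax.bar(professions, percent_male, color="steelblue")
ax.axhline(50, linestyle="--", color="gray", label="Even split (50%)")
ax.set_ylabel("Male-presenting images (%)")
ax.set_title("Perceived gender by profession in AI-generated images")
ax.legend()
fig.savefig("gender_by_profession.png", dpi=150)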
Across all 12 charts, the AI reinforced a consistent visual narrative:
- White men dominate high-skill professions.
- Women are mainly placed in nurturing, domestic roles.
- Youth and light skin are prioritized.
- Diversity is limited or aestheticized, not central.
These results confirm our hypothesis—not by comparing to national statistics, but by exposing the repetitive, biased ways the AI assigns identity through images. If left uncorrected, this technology risks further cementing harmful assumptions about who belongs in what role.
Discussion
Our results show that AI image generators often fail to present a diverse and balanced view of professional roles. Instead of challenging existing social norms, they reinforce narrow stereotypes. For instance, most of the babysitter images depicted young, light-skinned women, while mechanical engineers were predominantly white men in their 30s to 50s. These recurring patterns suggest that the AI is not drawing from a wide or neutral dataset, but from training inputs steeped in traditional assumptions about gender, race, and age.
Across all three professions—pharmacist, babysitter, and mechanical engineer—there were clear patterns of bias. Men dominated in both pharmacist and engineering roles, with pharmacists being 82% male and engineers 70% male in our generated set. Women were consistently underrepresented, particularly in technical fields. In contrast, the babysitter role was almost exclusively assigned to women, showing that caregiving is still visually coded as a “feminine” job by the AI. Even more telling was how men, when placed in non-traditional roles like babysitting, were distorted or animated—often looking unrealistic or cartoonish. This distortion may reflect a lack of training data representing men in such roles.
Race bias was also consistent. White and East Asian features were the most common across all roles, while other racial and ethnic groups—Black, Hispanic, South Asian, Middle Eastern—were rare. There was little to no representation of multiracial individuals, and visual markers of racial diversity were often subtle or ambiguous. This suggests the AI model favors “default” appearances that align with socially dominant racial imagery.
The Real vs. Animated distinction offered further insight. Serious roles like pharmacist and engineer were more often portrayed with realistic visuals, while babysitters were overwhelmingly animated. This not only infantilized the caregiving role but also made it seem less professional or credible. It’s concerning that realism is selectively applied based on stereotypical perceptions of authority or skill.
Another surprising discovery was the presence of non-human figures—robots or abstract forms—especially in the engineering images. Roughly a quarter of the mechanical engineer images showed something other than a human being. This may point to a deeper association the AI has between technical jobs and mechanization, but it also reveals how the system struggles to represent diversity in these roles, defaulting to inhuman or ambiguous visuals instead.
Age bias also stood out. Middle-aged adults (30–50) dominated the pharmacist and engineer visuals, while babysitters were mostly in their 20s. Seniors and children—who exist in these fields in real life—were virtually erased. This “invisible aging” pattern in AI mirrors how older adults are often sidelined in media and tech, despite their presence in the workforce.
One of the most concerning trends was the complete absence of people with disabilities or visible physical differences. This erasure reflects a broader issue with AI-generated images: they promote a flattened, idealized version of humanity that excludes non-normative bodies. If people begin using these visuals in educational tools, media, or marketing, it could reinforce the harmful idea that only a narrow type of person “fits” a role.
Together, these patterns confirm our hypothesis: AI generators do not create unbiased portraits of professionals. Instead, they follow the same narrow scripts that have long been embedded in media, education, and popular culture. If we had not seen such consistent results, we would have tested the prompts on other platforms or expanded our categories, but the patterns were clear enough that further testing was not necessary. This experience shows the urgent need to retrain AI systems with broader datasets and implement regular bias checks. Without that, AI tools will continue to reflect and even amplify the inequalities we are trying to move past.
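As one concrete form such a bias check could take, the sketch below compares observed category counts against a chosen reference distribution with a chi-square goodness-of-fit test. The 50/50 reference split and the example counts (82 male-presenting pharmacist images out of 100) are illustrative assumptions, not a prescribed auditing standard.

from scipy.stats import chisquare

def bias_check(observed_counts, expected_shares):
    # Compare observed category counts against a chosen reference distribution.
    total = sum(observed_counts)
    expected = [share * total for share in expected_shares]
    return chisquare(observed_counts, f_exp=expected)

# Example: 82 male-presenting vs. 18 other pharmacist images, checked against a 50/50 split.
stat, p = bias_check([82, 18], [0.5, 0.5])
print(f"chi-square = {stat:.1f}, p = {p:.4g}")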
Conclusion
The goal of this lab report was to explore whether AI image generators reinforce bias when visualizing people in professional roles. We focused on three contrasting jobs: pharmacist, babysitter, and mechanical engineer. From the start, we hypothesized that the AI would follow visual stereotypes—such as portraying engineers as men and babysitters as young women. After generating and analyzing 100 images per profession, our hypothesis was confirmed.
The AI consistently repeated stereotypical associations: white men in high-skill professions, women in caregiving roles, youth over age diversity, and animation for domestic or nurturing jobs. Individuals who did not match these patterns—like male babysitters or people with darker skin tones—were rarely seen or visually distorted. No visible disabilities were shown. These results matter because as AI becomes more widely used, its visual biases can shape public perception and influence who people imagine in various roles. To make AI-generated images more fair, inclusive, and useful, developers must prioritize diverse training data and bias auditing practices. Without these steps, AI will continue to reflect the same limits and prejudices that exist in our world.
References
Yang, Y. (2025). Racial bias in AI-generated images. AI & Society, 40(2), 123–135.
Yetisgen-Yildiz, A., & Yetisgen, M. (2024). Bias in artificial intelligence for medical imaging: Fundamentals, detection, avoidance, mitigation, challenges, ethics, and prospects. Diagnostic and Interventional Radiology, 31(2), 75–84. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11880872/
Bias in AI-generated imagery of surgeons: An evaluation of DALL-E 3 outputs and demographic representation. (2024). Journal of Family Medicine and Primary Care. https://www.sciencedirect.com/science/article/pii/S0974322724005581