AI still cannot accurately depict prehistoric life

Even when asked to draw on expert knowledge, popular AI programs still produce images and stories that resemble outdated museum exhibits of Neanderthals, extinct human relatives who lived in parts of Europe and Asia until tens of thousands of years ago.

This trend raises concerns about how these tools shape public understanding of early humans when people use them to get quick and easy answers.

Neanderthal life through an AI lens

A new study used four simple prompts to generate hundreds of Neanderthal scenes with ChatGPT and DALL-E 3, widely used artificial intelligence programs that create written stories and digital images from short text prompts.

At the University of Maine (UMaine), Associate Professor of Anthropology Matthew Magnani, Ph.D., traced these outputs back to older scholarship. Magnani ran each prompt 100 times, asking for both casual and expert answers.

Even when the prompts called for scientific accuracy, the systems kept sliding back into outdated images and text.

When accuracy drops

One set of prompts demanded scientific accuracy, while another let the system respond without that requirement.

Even with that demand, generative AI, software that produces new text or images from patterns in its training data, kept building answers from what it had already seen.

Rather than checking whether its claims were up to date, the software mainly changed its tone and added detail when asked to act as an expert.

Users who trust a confident tone are likely to carry those old beliefs into their schoolwork, social media posts, and family conversations.

Who gets left out

Many images center on muscular men and omit women and children, reflecting early ideas about prehistory.

Gender bias can creep in when training data overrepresents men. In that case, the model treats the imbalance as normal.

Modern archeology has worked hard to restore family life, care, and childhood to the picture, but the bots kept returning to lone hunters.

When this kind of output spreads widely, it becomes harder for museums and textbooks to correct the public record.

In several AI scenes, advanced objects such as ladders, thatched roofs, and neatly woven baskets were dropped into Neanderthal camps.

These additions are anachronistic, placing objects in the wrong era, and they can mislead without looking out of place.

Glass containers and metal tools also appeared, although such manufactured items are not found at Neanderthal sites.

The combination of modern materials and primitive bodies creates a distorted timeline that can seem plausible to anyone scrolling quickly.

Story lacks complexity

The text output often described simple routines of hunting, sleeping, and survival, ignoring the range of what scientists now debate.

By matching the chatbot's language against decades of research documents, the authors found that its text was closest to scholarship from the early 1960s.

In the same analysis, the DALL-E 3 images matched better with the late 1980s and early 1990s. This gap means that, especially in casual searches, the image may look up to date while the story underneath it is decades behind.
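The idea of matching output to a period of scholarship can be illustrated with a small sketch. This is not the authors' pipeline, which the article does not describe in detail. It assumes a simple word-count representation and cosine similarity, and the function names, period labels, and sample snippets are all hypothetical placeholders rather than text or data from the study.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count how often each word occurs."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_period(output_text, reference_texts):
    """Return the reference period most similar to the output, plus all scores."""
    query = bag_of_words(output_text)
    scores = {
        period: cosine_similarity(query, bag_of_words(text))
        for period, text in reference_texts.items()
    }
    return max(scores, key=scores.get), scores


# Invented placeholder snippets (assumption): NOT text or data from the study.
toy_references = {
    "older_style": "primitive brutish hunters surviving in caves",
    "newer_style": "families caring for children using pigments and ornaments",
}
toy_output = "brutish hunters sheltering in caves"

best_period, toy_scores = closest_period(toy_output, toy_references)
print(best_period, toy_scores)
```

A real analysis would need far larger reference collections and a richer text representation than raw word counts, but the comparison step has the same shape: score the output against each period and see which one it sits closest to.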

Images closest to the average embedding from four different prompts. Clockwise from top: With Prompt Revision, With Prompt Revision (Expert), Without Prompt Revision (Expert), Without Prompt Revision. Credit: Advances in Archaeological Practice.

What the bot reads

Copyright rules and paywalls have kept much of 20th-century archeology hard to access online, so older documents remained the most visible for longer. When a model learns from whatever it can scrape, it may treat availability, rather than recency, as a sign of truth.

Later, open access, meaning research that anyone can read online, widened the pool and made newer work easier to find.

As long as access to research remains uneven, AI depictions of the past will continue to favor what is most readily available.

A moving scientific target

Neanderthals lived in parts of Europe and Asia until about 40,000 years ago, when they disappeared.

Early descriptions, dating to 1864, depicted them as crude and primitive, and museum displays helped popularize that image.

A 1985 analysis showed how one old reconstruction, later revised, kept the slouching stereotype alive.

Because the science keeps changing as new fossils and techniques emerge, AI needs a way to track those updates in real time.

Trust issues in the classroom

Teachers are now seeing students paste AI answers into their papers, even when those answers quietly rely on decades-old thinking.

Fast tools reward speed, so checking the source feels optional unless someone stops and asks where the image came from.

“It’s important to look at the various biases built into the everyday use of these technologies,” Magnani said.

Without that habit, a convincing-looking Neanderthal scene can teach the wrong lesson faster than a careful lecture can correct it.

Improve AI performance

Solving the problem starts with feeding models better material, and the UMaine researchers have already shown a practical way to identify outdated output.

Tighter links between chatbots and searchable databases could let systems retrieve specific findings rather than guessing from memory.

“Our work provides a template for other researchers to explore the gap between scientific research and AI-generated content,” Magnani said.

Repeating the UMaine test each time a model is updated could show whether the gap narrows or whether new biases replace old ones.