Artificial intelligence and machine learning, next generation technology and secure development
USC study finds persona-based prompts reduce factual accuracy
Rashmi Ramesh (Rashmila Mesh_) •
March 25, 2026

The programmer tells the chatbot, “You are an expert.” Full stack developer. This is a mechanical massage technique that is the basis for persona-based artificial intelligence prompts, but it backfires spectacularly. Research shows that when goals are precise, the practice produces the worst outcomes, scholars say.
See also: AI is transforming the role of the chief data officer
Preprint by researchers at the University of Southern California
The study found that when large-scale language models are told, “You are an expert,” performance consistently decreases. Their advice is to avoid persona-based prompts for tasks that require the model to leverage pre-trained knowledge, i.e. the mountains of coding examples that are fed into the model before it is ready to interact with a customer.
The “You are an expert” prompt appears to push the model into a mode focused on following instructions, but it conflicts with its ability to acquire the knowledge needed to actually complete the task. “The persona prefix activates the instruction-following mode in models that would otherwise be focused on recalling facts,” said Zizhao Hu, a doctoral student at USC and lead author of the study.
Research has found that role-playing prompts are effective when the desired outcome is customized style or data extraction rather than exact code or mathematics. Persona prompts were helpful when it was important for the LLM’s output to match a certain tone (e.g., “professional email”) or to structure the data, the study authors wrote.
The AI model acquires two fundamentally different types of capabilities during training, and the expert persona manipulates each capability in opposite ways. The first type of ability, which includes factual knowledge, mathematical reasoning, and coding ability, is absorbed during the initial training of the model on large amounts of text. The second type, which includes adaptations of tone, form, and style, and the ability to reject harmful requests, is formed later, during the fine-tuning of the model to follow human preferences and instructions.
Prompting a chatbot won’t increase its factual knowledge, but the prompts may prevent it from recalling that knowledge.
The researchers used a standard assessment tool called the Measure Massive Multitasking Language Comprehension Benchmark. It uses multiple-choice questions to test models across hundreds of disciplines. They found that the same model without the persona’s instructions had an overall accuracy of 71.6%, whereas with the expert persona, the overall accuracy dropped to 68%. All variations of the expert persona prompt they tested produced worse results than the baseline across all subject categories. Longer persona descriptions caused the most damage.
In a separate benchmark measuring production quality across eight task categories, the researchers found that categories that rely on accurate fact recall or logical chaining, such as humanities knowledge, mathematical reasoning, and coding, were consistently degraded by expert persona prompts. Coding scores decreased by 0.65 points on the benchmark’s 10-point scale.
For tasks formed by instruction tuning, the situation is reversed. Experts improved their scores on writing, information extraction, and STEM explanation tasks, where structure, tone, and format are more important than accuracy. The gains were most significant in data extraction and technical subject matter explanations.
Safety denials showed the most rapid improvement of all. The dedicated “safety monitor” persona increased the model’s rejection rate of harmful prompts from 53.2% to 70.9% on JailbreakBench, one of the widely used adversarial benchmarks.
The researchers found that how strongly a model is optimized to follow system-level instructions determines the model’s sensitivity to bidirectional persona prompts. A more optimized model can gain more information from the persona for user-tailored tasks, but lose more information for fact-based tasks.
The findings have implications for how AI products are built and deployed. Many enterprise systems now assign persistent “expert identities” (Legal Assistant, Medical Advisor, Financial Analyst) to models at the system level through instructions that are executed before a user’s query.
Hu says this approach involves trade-offs that companies may not be aware of. He said, “Unless a specific persona is used during training with corresponding domain data, simply having the model run a persona during inference is likely to compromise the model’s accuracy and ability to reproduce facts.”
