In the rapidly evolving landscape of generative AI, business leaders are trying to strike the right balance between innovation and risk management. Prompt injection attacks are emerging as a significant challenge. Malicious attackers attempt to manipulate AI systems into doing things other than their intended purpose, such as creating harmful content or exfiltrating sensitive data. Organizations can not only mitigate these security risks; I'm concerned about quality and reliability. They want to ensure that their AI systems do not generate errors or add information that is not substantiated by the application's data sources. This can erode user trust.
To help customers address these AI quality and safety challenges, we are announcing new tools for generative AI app developers that are now available or coming soon in Azure AI Studio.
- prompt shield Detect and block prompt injection attacks. This includes a new model to identify indirect prompting attacks before they impact your model, coming soon and available in preview in Azure AI Content Safety.
- Safety evaluation A tool for assessing application vulnerability to jailbreak attacks and content risk generation, currently available in preview.
- Risk and safety monitoring Understand which model inputs, outputs, and end users are triggering content filters and informing mitigations, coming soon and available in preview on Azure OpenAI Service.
With these additions, Azure AI continues to provide customers with innovative technologies to protect their applications throughout the generative AI lifecycle.
Protect LLM from prompt injection attacks using prompt shields

Prompted injection attacks (both direct and indirect attacks known as jailbreaks) have emerged as a significant threat to the safety and security of underlying models. Successful attacks that circumvent safety mitigations in AI systems can have serious consequences, including the disclosure of personally identifiable information (PII) and intellectual property (IP).

To combat these threats, Microsoft introduced Prompt Shields, which detects suspicious input in real-time and blocks it before it reaches the underlying model. This proactive approach protects the integrity of large language model (LLM) systems and user interactions.

Prompt shield against jailbreak attacks: A jailbreak, direct prompt attack, or user prompt injection attack is when a user manipulates the prompt to inject harmful input into the LLM, distorting its actions and output. An example of a jailbreak command is a “DAN” (Do Anything Now) attack. This could trick LLM into producing inappropriate content or ignoring system-imposed limits. Prompt Shield for Jailbreak Attacks, released in November as Jailbreak Risk Detection, detects these attacks by analyzing prompts for malicious instructions and blocks their execution.

Prompt shield against indirect attacks: Indirect prompt injection attacks are less well-known than jailbreak attacks, but they pose unique challenges and threats. In these covert attacks, hackers aim to manipulate AI systems indirectly by changing input data such as websites, emails, and uploaded documents. This allows a hacker to trick the underlying model into performing unauthorized actions without directly tampering with the prompt or his LLM. This may result in account takeover, defamatory or harassing content, or other malicious activity. To counter this, we introduce his Prompt Shield for indirect attacks. It is designed to detect and block these hidden attacks to support the security and integrity of generative AI applications.
Identifying LLM hallucinations by ground detection

In generative AI, “hallucinations” refer to instances in which a model confidently produces output that is inconsistent with common sense or lacks underlying data. This problem can manifest itself in a variety of ways, ranging from minor inaccuracies to grossly erroneous output. Identifying hallucinations is important for improving the quality and reliability of generative AI systems. Today, Microsoft is announcing Groundedness Detection, a new feature designed to identify text-based hallucinations. This feature detects “unsubstantiated content” in text to support the quality of LLM output.
Control applications with effective safety system messages
In addition to adding safety systems such as Azure AI Content Safety, prompt engineering is one of the most powerful and popular ways to improve the reliability of generative AI systems. Today, with Azure AI, users can base their underlying models on trusted data sources and build system messages that guide optimal use of that underlying data and overall behavior (not that). , run this). Microsoft has discovered that even small changes to system messages can have a significant impact on the quality and safety of your applications. To help customers create effective system messages, we plan to provide safety system message templates by default directly into Azure AI Studio and Azure OpenAI Service playground. Developed by Microsoft Research to reduce the generation and exploitation of harmful content, these templates help developers start building high-quality applications faster.
Assess the risk and safety of your LLM application

How do I know if my applications and mitigations are working as intended? Many organizations now stress test their generative AI applications to move from prototype to production with confidence. We lack the resources to do so. First, it can be difficult to build high-quality test datasets that reflect a variety of new risks, such as jailbreak attacks. Even with high-quality data, assessment can be a complex and manual process, and development teams can find it difficult to interpret the results and provide effective mitigation.
Azure AI Studio provides robust automated assessments to help organizations systematically evaluate and improve generated AI applications before deploying them into production. We currently support pre-built quality metrics such as Reasonability, Relevance, and Fluency, but today we are announcing automated assessment of new risk and safety metrics. These safety ratings measure an application's susceptibility to the creation of hateful and unfair content related to jailbreak attempts, violent, sexual, and self-harm. It can also explain assessment results in natural language to inform appropriate mitigation measures. Developers can evaluate their applications using their own test datasets or generate high-quality test datasets using the Adversarial Prompt template developed by Microsoft Research. With this capability, Azure AI Studio also helps red teams enhance and accelerate manual red team efforts by enabling them to generate and automate adversarial prompts at scale.
Monitor your Azure OpenAI Service deployment for risk and safety in production

Monitoring generative AI models in production is an important part of the AI lifecycle. Today, we're excited to announce risk and safety monitoring in Azure OpenAI Service. Developers can now visualize the amount, severity, and category of user input and model output blocked by Azure OpenAI Service content filters and blocklists over time. In addition to content-level monitoring and insights, we introduce reporting of potential abuse at the user level. Enterprise customers now have better visibility into the tendency of their end users to continually send dangerous or harmful requests to their Azure OpenAI service models. When content from a user is flagged as harmful by a customer's pre-configured content filters or blocklists, the service uses contextual signals to determine if the user's behavior constitutes abuse of the AI system. Decide whether to do so. These new monitoring capabilities allow organizations to better understand trends in application and user behavior and apply those insights to adjust content filter configurations, blocklists, and overall application design.
Scale your next generation of secure and responsible AI applications with confidence
Generative AI can be a power multiplier for any sector, company, or industry. Azure AI customers use this technology to operate more efficiently, improve customer experiences, and create new avenues for innovation and growth. At the same time, the underlying model introduces new security and safety challenges that require new mitigations and continuous learning.
Stay ahead of the curve by investing in app innovation
learn more
At Microsoft, we ground our research, policy, and engineering efforts in AI principles, whether we're working on traditional machine learning or cutting-edge AI technologies. Microsoft built the Azure AI portfolio to help developers incorporate important responsible AI practices directly into their AI development lifecycle. As such, Azure AI provides first-party co-pilots and the thousands of customers who are using Azure AI to build their own innovative solutions with a consistent, scalable platform for responsible innovation. We provide. We look forward to continuing to collaborate with our customers and partners on new ways to reduce, assess, and monitor risk, and help any organization use generative AI to confidently achieve their goals.
Click here to learn more about today's announcement
- Get started with Azure AI Studio.
- Dig deeper with Tech Community's technical blog.

Azure AI Studio
Build AI solutions faster with pre-built models or train models using your data to innovate securely and at scale.