Here are two new features from Microsoft which will enhance the detection of risky AI usage and generative AI interactions.
Microsoft Purview Insider Risk Management is introducing new detections for risky AI usage.
This update will enhance the ability of administrators to identify risky AI usage within their organizations. The new detections will cover both intentional and unintentional insider risk activities related to generative AI applications, including risky prompts containing sensitive information or intent and sensitive responses generated from sensitive files or sites. The detections will apply to M365 Copilot, Copilot Studio, and ChatGPT Enterprise, contributing to Adaptive Protection insider risk levels.
Using IRM administrators can gain insights into risky AI usage in an anonymized form using analytics, create policies to track risky prompts and sensitive responses, and use the new generative AI indicators in adaptive protection to assess user risk scores. Microsoft Purview Insider Risk Management correlates various signals to identify potential malicious or inadvertent insider risks, such as IP theft, data leakage, and security violations. It allows customers to create policies based on their internal governance and organizational requirements, with privacy by design ensuring user-level privacy through pseudonymization and role-based access controls. There is a rather good description of Insider Risk Management at
Microsoft Learn.
The Public preview is already rolling out, but changes to the time schedule may occur. Stay tuned to the Roadmap
ID 394281 for more accurate information on release.
Microsoft Communication Compliance with new capabilities to detect potentially risky generative AI interactions.
Leveraging Microsoft Azure AI Content Safety, two new classifiers are introduced: Prompt shield and Protected material. The Prompt shield classifier detects risks of prompt injection attacks (jailbreak) by malicious users, while the Protected material classifier identifies when generative AI responses contain branded or copyrighted material. This can help organizations maintain content originality and protect their reputations.
Communication Compliance admins will find these classifiers in the trainable classifier list, configured by default in the Detect Microsoft Copilot interactions template policy. When a policy flags a risky interaction, the new classifier names will appear in the Conditions detected banner. This rollout is automatic, requiring no admin action before the specified date. Once visible, these classifiers can be configured in any Communication Compliance policy for generative AI workloads.
The public preview started rolling out late November, and is expected to complete by late December 2024. But stay up to date by checking the Roadmap
ID 422334.