Generative AI is a new attack vector for platforms, according to ActiveFence Threat Intelligence

New ActiveFence report reveals how generative AI is being abused to create child sexual abuse material, disinformation, fraud and extremist content on online platforms of all sizes

NEW YORK, May 23, 2023 /PRNewswire/ -- ActiveFence, whose mission is to protect online platforms and their users from malicious behavior and harmful content, today released the “Generative AI: The New Attack Vector for Platforms” report. For this research, ActiveFence investigated hidden communities to learn how threat actors are abusing generative AI to produce child sexual abuse material (CSAM), disinformation, fraud, and extremist content.

“The explosion of generative AI will have far-reaching implications for every corner of the internet,” said Noam Schwartz, CEO and founder of ActiveFence. “We have identified three main areas of concern. First, we see threat actors accelerating and expanding their activity, leading to an unprecedented mass production of malicious content. Second, these same attackers are looking for ways to exploit generative AI, manipulating these models to reveal their inherent vulnerabilities. Third, all of this puts pressure on digital platforms to improve the accuracy and efficiency of their data training protocols.”

This report identifies some of the primary ways generative AI is being abused.

  • Creating child sexual abuse material, from visual images to erotic stories
  • Generating deceptive AI images to trick millions of people
  • Producing deepfake audio files promoting extremism

Child Sexual Abuse Material

ActiveFence tracked a 172% increase in the volume of shared CSAM produced with generative AI in the first quarter of this year. It also detected a poll conducted by the administrators of a closed child predator forum on the dark web, which surveyed about 3,000 predators about their use of generative AI. The poll revealed that 78% of respondents are using or plan to use generative AI for CSAM, while the remaining 22% said they plan to try the technology. These predator forums use generative AI algorithms to produce textual descriptions, stories and narratives as well as sexual images.

In one instance, when asked to write an erotic story involving two minors, a major generative AI platform rejected the request as “inappropriate and possibly illegal,” according to ActiveFence. But when asked the same question with a few words changed, the algorithm created a sensual story about an adult man inappropriately watching two boys swimming.

Child predators are also creating tutorials for their generative AI creations. This allows them to gain credibility within the child predator community, encourage others to replicate their efforts, and share suggested phrases and keywords to circumvent platform safeguards. To get around these platform limitations, ActiveFence detected child predators making requests in different languages, using alternative and suggestive terms, and manipulating AI algorithms with different prompts, inputs, and specialized models.

Disinformation and Deceptive Content

Fraud and disinformation are not new concepts, but generative AI has enabled threat actors to create fraudulent images with greater speed, accuracy, and reach.

One of the AI-generated images ActiveFence detected on Telegram falsely depicts Russian President Vladimir Putin kneeling before Chinese President Xi Jinping and begging for help in the Ukraine conflict. ActiveFence identified several key generative AI signifiers in this image: hidden faces, blurry hands, distorted furniture, and a lack of photographic provenance. Despite these indicators, the misleading content reached 10 million users.

To demonstrate how threat actors manipulate generative AI chatbots for malicious purposes, ActiveFence uncovered methods used to override several policies of leading generative AI platforms. In one case, a bot was prompted to craft a phishing email, and in another, a bot was successfully made to write inauthentic positive reviews of apps that are widely accessible on major online marketplaces. Although the reviews in this example are positive, the tactic is malicious: it not only misleads users of the platform but can also undermine the platform’s credibility as a safe place for online activity.

Violent Extremism

ActiveFence has detected numerous instances of threat actors abusing generative AI to create surreal yet harmful content that incites violence and promotes extremist propaganda. These threat actors are using generative AI to craft racist, nationalist, or extremist manifestos and speeches.

ActiveFence uncovered AI-generated deepfake audio files exploiting growing political and economic hardships. One fake recording imitated the voice of a famous British journalist and incited an uprising against the British government. The misleading manifesto gave directions for procuring weapons on the underground market and encouraged attacks on Britain’s national infrastructure.

ActiveFence made these discoveries through its technology and analytics capabilities, which provide organizations with accurate, detailed, contextual and actionable insight into online harm, helping them close policy gaps, improve enforcement and increase safety. With expertise in over 100 languages, ActiveFence has extensive coverage of the clear and dark web, including access to communities involved in child sexual abuse, disinformation, hate speech, terrorism, violent extremism, and fraud.

ActiveFence today announced that it will bring the following capabilities to GenAI platforms and to large-scale platforms looking to integrate them:

  • Automatic Prompt Moderation – Stop prompt injection and jailbreaking
  • Automatic Output Filtering – Detect violating outputs at scale via contextual analysis models
  • AI Model Safety Testing – Keep your AI training data safe
  • Gen AI Red Teaming – Identify hazards and loopholes in products, policies and enforcement
  • Threat Landscape – Reports on dark web and off-platform threats and attacks
  • Generative AI T&S Platform – Provides end-to-end enforcement and management

About ActiveFence
ActiveFence is the leading solution for trust and safety intelligence and management, protecting online platforms and their users from malicious behavior and content. Trust and safety teams of all sizes rely on ActiveFence to protect their users from a wide range of online harms, unwanted content, and malicious behavior, including child safety and exploitation, disinformation, hate speech, terrorism, nudity, fraud, and more. ActiveFence offers full-stack capabilities with deep intelligence research, AI-driven harmful content detection, and a content moderation platform. ActiveFence protects over 3 billion users in 100 languages every day around the world, enabling people to interact and thrive online. Backed by leading Silicon Valley investors such as CRV and Norwest, ActiveFence has raised $100 million to date and employs over 300 people worldwide.

SOURCE ActiveFence


