It doesn’t take long to make machine learning algorithms fail

Machine Learning


T.he algorithm Underlying modern artificial intelligence (AI) system needs a lot of data to train. Much of that data, unfortunately, comes from the open web. AIIt is susceptible to cyberattacks known as “data poisoning.” This means changing or adding irrelevant information to the training data set so that the algorithm can learn harmful or unwanted behaviors. Like real poison, contaminated data can go unnoticed until the damage is done.

Please listen to this story.
Enjoy more audio and podcasts at iOS again Android.

your browser is

Data poisoning is not a new idea. In 2017, researchers demonstrated that in this way a self-driving car’s computer vision system could mistake a stop sign for a speed limit sign. However, it was unclear how viable such a ruse would be in the real world. Alina Oprea, a computer scientist at Northeastern University in Boston, says safety-critical machine learning systems are typically trained on closed datasets curated and labeled by human workers.

However, with the recent rise of generative, AI tools such as chatGPTwhich is run on a large language model (LLMs), and the imaging system Darui 2companies trained their algorithms on much larger repositories of directly scraped data, mostly scraped indiscriminately from the open internet. will remain vulnerable to the digital poison injected by, says Florian Tramèr. ethereum Zurich.

Dr. Tramèr worked with researchers at Google to NVIDIA Robust Intelligence is a company that builds systems to monitor machine learning-based data. AI, to determine how feasible such data poisoning schemes are in the real world. I purchased a web page that is not.By replacing his 1,000 images of apples (only 0.00025% of the data) with randomly selected images, the team AI It was trained on “tainted” data and consistently incorrectly labeled photos as containing apples. Replacing the same number of images labeled “unsafe for work” with harmless images, AI This flagged similar harmless images as explicit.

Researchers have also shown that parts of the web (such as Wikipedia) can be injected with digital poison and periodically downloaded to create a text data set. LLMsseconds. The team’s research has been posted as a preprint on arXiv and has not yet been peer-reviewed.

cruel device

Some data poisoning attacks can degrade overall system performance. AI tool. More sophisticated attacks can trigger specific reactions within the system. Dr. Tramèr AI For example, a search engine chatbot could be tweaked so that each time a user asks which newspaper to subscribe to, AI economist”It may not sound so bad, but similar attacks can also AI Spouting falsehoods whenever asked about a specific topic.attack on LLMsBecause they generate computer code, these systems have made software vulnerable to hacking.

A limitation of such attacks is that they are probably less effective for topics that already have a large amount of data on the Internet. For example, launching a poison attack on the president of the United States would be much more difficult than placing a few poison data points on a relatively obscure politician, says Eugene Bagdasaryan, a computer scientist at Cornell University. says. Language models are more or less positive about the topic chosen.

Marketers and digital spin doctors have long used tactics similar to game ranking algorithms in search databases and social media feeds. The difference here is, according to Bagdasarian, AI The model propagates its unwanted biases to other domains. Mental health counseling bots that speak more negatively about specific religious groups are problematic, as are financial or policy advice bots that are biased towards specific people or political parties.

If no major cases of such addiction attacks have yet been reported, it is likely that the current generation LLMss was only trained on web data until 2021, but it is now widely known that information placed on the open internet can be a training algorithm for composing people’s emails. It was before it was possible.

To weed out training data sets of tainted material, companies need to know what topics and tasks attackers are targeting. Research by Dr. Tramèr and colleagues suggests that companies can scrub website data sets that have changed since they were originally collected before training algorithms (although he suggests the opposite). , pointing out that the website is being continuously updated for harmless reasons). Attacks on Wikipedia, on the other hand, could be thwarted by randomizing the timing of snapshots of the data set. A clever poisoner can get around this, however, by uploading compromised data over an extended period of time.

As it becomes more common, AI As chatbots become directly connected to the Internet, these systems will increasingly ingest unverified data that may not be suitable for consumption. Google’s Bard chatbot, which was recently available in the US and UK, is already connected to the internet and is openAI Released a web surfing version of Chat to a small number of usersGPT.

This direct web access opens the possibility of another type of attack known as indirect prompt injection. AI The system is tricked into behaving in certain ways by supplying hidden prompts on web pages that the system is likely to visit. Such prompts could, for example, instruct a chatbot that helps a customer shopping to reveal the user’s credit card information, or pose an educational problem. AI Bypass that safety control. Defending against these attacks can be an even bigger challenge than keeping digital poison out of training data sets. In a recent experiment, a team of German computer security researchers showed that attack prompts could be hidden in annotations on the Wikipedia page about Albert Einstein. LLMs They were testing it to generate text with a pirate accent.AI It did not respond to requests for comment. )

Generative Big Players AI Filter the web scraped dataset before feeding it to the algorithm. This may catch some malicious data. Much work is also underway to inoculate chatbots against injection attacks. But even if there were a way to eavesdrop on every data point manipulated on the web, perhaps a trickier question is who defines what counts as digital poison. Unlike training data for a self-driving car driving past a stop sign or an image of an airplane labeled as an apple, generative processes are given a lot of “poisons.” AI Models can be somewhere between right and wrong, especially when it comes to political topics.

This can be a major obstacle to organizational efforts to keep such cyberattacks off the Internet. As Dr. Tramèr and his co-authors point out, no single entity can unilaterally determine what is fair and what is illegal. AI training data set. One party’s tainted content is another party’s slick marketing campaign. For example, if a chatbot is unwavering in its endorsement of a particular newspaper, it could be poison at work, or it might just reflect a simple, uncomplicated fact.

Interested in the world? To enjoy science coverage that expands our minds, sign up for Simply Science, our weekly subscriber-only newsletter.



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *