As the use of artificial intelligence — both benign and malicious — grows at a ferocious rate, more cases of potentially harmful outputs are coming to light. These include hate speech, copyright infringement and sexual content.
The emergence of these unwanted behaviors is exacerbated by a lack of regulation and insufficient testing of AI models, researchers told CNBC.
Getting machine learning models to behave the way they are intended to behave is a tall order, said AI researcher Javier Rando.
“The answer, after almost 15 years of research, is: no, we don't know how to do this, and it doesn't look like we are getting better,” Rando, who focuses on adversarial machine learning, told CNBC.
However, there are some ways to assess AI risk, such as red teaming. The practice involves individuals testing and probing artificial intelligence systems to uncover and identify potential harms, and it is a common approach in cybersecurity circles.
Shayne Longpre, a researcher in AI and policy and lead of the Data Provenance Initiative, noted that there are currently not enough people working in red teaming.
While AI startups currently test their models using first-party evaluators or contracted second parties, opening the testing to third parties such as everyday users, journalists, researchers and ethical hackers would lead to more robust evaluation, according to a paper published by Longpre and fellow researchers.
“Some of the flaws people were finding in these systems required medical doctors to actually examine them, and professional subject matter experts — real scientists — to figure out whether it was a flaw or not,” Longpre said.
Adopting standardized “AI flaw” reports, incentives for disclosure and ways to disseminate information about these flaws in AI systems are among the recommendations made in the paper.
The practice has been successfully adopted in other sectors such as software security, so “we need that in AI now,” Longpre added.
Marrying this user-centered practice with governance, policy and other tools would give a better understanding of AI tools and the risks they pose to users, Rando said.

It's no longer a moonshot
Project Moonshot is one such approach, combining technical solutions with policy mechanisms. Launched by Singapore's Infocomm Media Development Authority, Project Moonshot is a large language model evaluation toolkit developed with industry players such as IBM and Boston-based DataRobot.
The toolkit integrates benchmarking, red teaming and baseline testing. It also provides an evaluation mechanism that allows AI startups to ensure their models can be trusted and do no harm to users, Anup Kumar, head of client engineering for data and AI at IBM Asia Pacific, told CNBC.
Evaluation is a continuous process that should be done both before and after model deployment, said Kumar, who noted that the response to the toolkit has been mixed.
“A lot of startups took this up as a platform because it's open source, and they started leveraging it. But I think we can do a lot more.”
Moving forward, Project Moonshot aims to include customization for specific industry use cases and to enable multilingual and multicultural red teaming.
Higher standards
Pierre Alquier, professor of statistics at ESSEC Business School, Asia-Pacific, said tech companies are currently rushing to release their latest AI models without proper evaluation.
“When pharmaceutical companies design new drugs, they need testing and very serious evidence that they are useful and not harmful before they are approved by the government,” he pointed out, adding that a similar process is in place in the aviation sector.
AI models, too, should need to meet a stringent set of conditions before they are approved, Alquier added. A shift away from broad AI tools and toward ones designed for more specific tasks would make it easier to anticipate and control misuse, he said.
“LLMs can do too many things, but they are not targeted at specific enough tasks,” he said. As a result, “the number of possible misuses is too large for developers to predict all of them.”
Such broad models make it difficult to define what counts as safe and secure, according to research that Rando was involved in.
Tech companies should therefore avoid overclaiming that “their defenses are better than they are,” Rando said.
