I said camel, not ostrich! Why does AI get Arab dishes and culture so wrong?

AI News


If you can tell the difference between a large humped mammal and a clumsy flightless bird, you are already doing better than some AI models.

Researchers in the UAE found that even AI models optimized for Arabic struggle to understand content from the Arab world. They may mistake a camel for an ostrich, label a traditional Moroccan hat a Mexican sombrero, or fail to identify a popular dessert from the Gulf.

While these mistakes may be funny, they highlight a serious point: AI often errs when working with material from the Arab world, including images. For AI developers, there are reputational risks in ignoring cultural nuance, giving vague answers or, worst of all, getting things completely wrong.

Researchers at Abu Dhabi’s Mohammed Bin Zayed University of Artificial Intelligence (MBZUAI) recorded the errors while testing six “prominent” AI models, five of which were designed to work in Arabic. As vision-language models, they can interpret videos, photos, and text to produce written descriptions and summaries.
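To make “vision-language model” concrete, here is a minimal captioning sketch using an off-the-shelf open-source model (Salesforce’s BLIP, via the Hugging Face transformers library). It is an illustrative stand-in, not one of the six models the researchers tested, and the input file name camel.jpg is hypothetical.

```python
# Minimal image-captioning sketch with an off-the-shelf vision-language model.
# BLIP is used purely to illustrate the image-to-caption workflow; it is NOT
# one of the Arabic-focused models evaluated in the MBZUAI study.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

image = Image.open("camel.jpg")  # hypothetical input image
inputs = processor(images=image, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=30)
caption = processor.decode(output_ids[0], skip_special_tokens=True)
print(caption)  # ideally "a camel in the desert"; per the study, often much vaguer
```

The study’s finding is that, for culturally specific inputs, this kind of pipeline tends to return either a vague caption or a confidently wrong one.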

The five Arabic-specific models, all open-source and freely available to the public, include some created by researchers in the UAE and others by emerging technology companies. One of the researchers, Karima Kadaoui, said the models may “know a little bit about Arab culture” without understanding its details.

“If you have an instrument that is very specific to a culture, the model will either describe it in very vague terms and just call it an instrument, without using its actual name, or it will attribute it to a completely different culture,” she said.

“For example, there is a woman wearing a hat from northern Morocco. Most models don’t recognize it at all. They just say something vague like ‘hat’. Then some models get overconfident and claim it belongs to another culture. In this example, it’s a Mexican hat.”

Kadaoui, a doctoral student, told The National it was “common” for the models to see a camel and mistake it for an ostrich. Halwa, a sweet Omani dessert, was incorrectly identified in different ways by different models: one described it as a nut-covered sweet that “could be a cake,” another suggested it was pastilla, a Moroccan pie, and a third said it was a type of baklava.

The researchers also observed what they described as dialect mixing, where the AI would start responding in Moroccan Arabic but mix in words from Egyptian Arabic. When asked to respond in a particular dialect, the models often reverted to Modern Standard Arabic.

The university said in a statement that these examples highlight challenges for Arabic-enabled AI, including accurately recognizing objects, understanding their cultural context, using specific Arabic dialects correctly, and providing consistent responses.

Although the study focused on Arabic-enabled AI and objects from the Arab world, the researchers said similar mistakes could occur when AI identifies material from other regions; they cited Southeast Asia as an example.

One reason an AI model may fail when interpreting material from the Arab world is that the initial annotation of its training images was done by people from other parts of the world. This human annotation is critical because it provides the data on which the model is built: a model trained on a more comprehensive and diverse human-annotated dataset is more likely to be culturally grounded.
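A hedged sketch of what such annotation records might look like. The file names and captions below are invented for illustration; the point is that whatever wording annotators write (“hat” versus the hat’s actual regional name) is exactly what the model learns to reproduce.

```python
# Hypothetical caption-annotation records of the kind that feed a
# vision-language model's training set. All paths and labels are invented.
annotations = [
    # An annotator unfamiliar with the region writes a vague or wrong label...
    {"image": "img_001.jpg", "caption": "a woman wearing a hat"},
    {"image": "img_002.jpg", "caption": "a plate of cake covered in nuts"},
    # ...while a culturally grounded annotator preserves the specific terms.
    {"image": "img_001.jpg", "caption": "a woman wearing a traditional northern Moroccan hat"},
    {"image": "img_002.jpg", "caption": "a plate of Omani halwa topped with nuts"},
]

# Training maps each image to its caption text, so whichever wording dominates
# the dataset becomes the model's "knowledge" of the object.
for record in annotations:
    print(f"{record['image']} -> {record['caption']}")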

The researchers detailed their findings in a paper presented at a conference of the European chapter of the Association for Computational Linguistics, held in Morocco in March. The paper was written by seven researchers from MBZUAI and three from Toloka AI.

Another author, Hamdan Al Ali, a PhD student at the university, said part of the reason the models misidentify images lies in how the images are analyzed.

“It’s statistical, and it determines what is most likely,” he said. “So, for example, an image of a camel could be identified as a llama. The model would say something like, ‘This animal has four points touching the ground. It has this particular color, a brownish color,’ and it would identify it based on that.”

“That is different from how humans distinguish a camel from a llama. Humans look at the hump on the camel.”
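Al Ali’s description, scoring coarse visual features and picking the most likely label, can be illustrated with a toy classifier. The label set and logit scores below are invented for illustration and do not come from any real model.

```python
import numpy as np

# Toy illustration of "pick the most likely label": invented logit scores for
# one image, based only on coarse features (four legs, brownish color, etc.).
labels = ["camel", "llama", "ostrich", "horse"]
logits = np.array([2.1, 2.3, 0.4, 1.0])  # llama scores slightly higher here

# Softmax turns scores into probabilities; argmax picks the single best guess.
probs = np.exp(logits) / np.exp(logits).sum()
for label, p in zip(labels, probs):
    print(f"{label}: {p:.2f}")

print("prediction:", labels[int(np.argmax(probs))])  # -> "llama", despite the hump
```

Because coarse features like leg count and color do not separate a camel from a llama, a model leaning on them alone can confidently pick the wrong label, which is exactly the failure mode the researchers describe.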

Kadaoui said the research could encourage AI developers to create more culturally sensitive models.

“We are starting to have discussions about making these models more inclusive,” she said. “We must demand inclusivity. It rarely comes naturally. We must fight for it. We must demand it. And we must continue the work of exposing gaps and biases.”


