For the first time, AI analyzes language like human experts



The original version of this story appeared in Quanta Magazine.

Among the countless abilities that humans possess, which are unique to us? Language has been the leading candidate at least since Aristotle, who described humans as "animals with language." While large language models like ChatGPT can superficially reproduce ordinary speech, researchers want to know whether certain aspects of human language remain entirely beyond the reach of other animals' communication systems and of artificial intelligence.

In particular, researchers have investigated the extent to which language models can reason about language itself. Some in the linguistics community hold that language models simply lack the reasoning abilities required to do so. This view was summarized by the renowned linguist Noam Chomsky and two co-authors in a 2023 New York Times piece, in which they wrote that proper explanations of language are complex and cannot be learned simply by immersing a model in big data. AI models may be adept at using language, they argued, but they cannot analyze it in sophisticated ways.


Gašper Beguš, a linguist at the University of California, Berkeley.

Photo: Jamie Smith

This view is challenged in a recent paper by Gašper Beguš, a linguist at the University of California, Berkeley; Maksymilian Dąbkowski, who recently received his PhD in linguistics from Berkeley; and Ryan Rhodes of Rutgers University. The researchers subjected a number of large language models (LLMs) to a battery of linguistic tests, including one that required generalizing the rules of languages invented for the study. Although most LLMs could not parse linguistic rules the way humans do, one far exceeded expectations: it was able to analyze language in much the same way a linguistics graduate student would, diagramming sentences, resolving multiple ambiguous meanings, and handling complex linguistic features such as recursion. Beguš said the discovery "challenges our understanding of what AI can do."

The new study is both timely and “very important,” said Tom McCoy, a computational linguist at Yale University who was not involved in the study. “As society becomes more reliant on this technology, it becomes increasingly important to understand where it succeeds and where it may fail.” Linguistic analysis is an ideal testbed to assess the extent to which these language models can reason like humans, he added.

Infinite complexity

One of the challenges of subjecting a language model to rigorous linguistic testing is ensuring that the model doesn't already know the answers. These systems are typically trained on large swathes of the internet, including vast amounts of text in dozens, if not hundreds, of languages, and even linguistics textbooks. In theory, a model could simply memorize and regurgitate information it was given during training.

To avoid this, Beguš and his colleagues created a four-part language test. Three of the four parts involved asking the models to analyze specially constructed sentences using tree diagrams of the kind introduced in Chomsky's groundbreaking 1957 book Syntactic Structures. These diagrams divide sentences into noun phrases and verb phrases, and further subdivide those into nouns, verbs, adjectives, adverbs, prepositions, conjunctions, and so on.
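The kind of tree diagram described above can be sketched in code. This is only an illustrative representation, not the format used in the study: a constituency tree for "the sky is blue," written as nested tuples of a phrase label and its children.

```python
# A constituency parse of "the sky is blue" as nested tuples:
# S (sentence) splits into a noun phrase (NP) and a verb phrase (VP),
# which subdivide into a determiner, noun, verb, and adjective.
tree = (
    "S",
    ("NP", ("Det", "the"), ("N", "sky")),
    ("VP", ("V", "is"), ("AdjP", ("Adj", "blue"))),
)

def leaves(node):
    """Collect the words at the leaves of the tree, left to right."""
    if isinstance(node, str):          # a bare string is a word
        return [node]
    _label, *children = node           # otherwise: (label, child, child, ...)
    words = []
    for child in children:
        words.extend(leaves(child))
    return words

print(" ".join(leaves(tree)))  # the sky is blue
```

Reading the words back off the leaves recovers the original sentence, while the internal labels record how the sentence decomposes into phrases.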

Part of the testing focused on recursion, the ability to embed phrases within phrases. "The sky is blue" is a simple English sentence. "Jane says the sky is blue" embeds the original sentence inside a slightly more complex one. Importantly, this process of recursion can continue indefinitely: "Maria wondered whether Sam knew that Omar had heard Jane say the sky was blue" is awkward, but still grammatically correct.
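The unbounded nature of this embedding is easy to demonstrate. Here is a minimal sketch (not taken from the study) that wraps a sentence in as many "X says that ..." clauses as you like:

```python
def embed(sentence, speakers):
    """Recursively embed a sentence under 'X says that ...' clauses.

    Each pass wraps the current sentence in one more reporting clause,
    so the result is grammatical at any depth of nesting.
    """
    for name in reversed(speakers):
        sentence = f"{name} says that {sentence}"
    return sentence

print(embed("the sky is blue", ["Maria", "Sam", "Jane"]))
# Maria says that Sam says that Jane says that the sky is blue
```

Because nothing stops the loop from running over an arbitrarily long list of speakers, the construction can in principle go on forever, which is exactly what makes recursion a hallmark of human language.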
