Focus on generative AI trained by generative AI

Machine Learning


Training AI systems to perform specific tasks accurately and reliably requires large amounts of data. Harnessing human labor to label that data is a key part of training machine learning models for specific applications such as self-driving cars. Workers from countries such as India and Sri Lanka, and workers on Mechanical Turk, an Amazon-created platform, are two sources of labor for this work.

Many of today’s “machine learning” or “deep learning” programs, including image recognition for self-driving cars, rely on thousands of humans in India and Sri Lanka to label every photo; AI programs then use these human labels as reference data when attempting tasks such as recognizing traffic signs or differentiating between pedestrians and cyclists. The main advantage of using Indian or Sri Lankan labelers is cost efficiency: for the same amount of money, you may be able to hire more workers to complete the labeling job. An interesting consideration is the potential for cultural and linguistic differences, a phenomenon I call “English to English” translation. Many Indians and Sri Lankans are fluent in English, but subtle linguistic nuances and culturally specific contexts can be missed. On the other hand, Amazon’s Mechanical Turk (MTurk) is a crowdsourced marketplace that connects “requesters” (people who need tasks performed) with “workers” who are willing to perform them. MTurk has a vast workforce with diverse backgrounds and a global reach. This versatility is especially useful for tasks that require multilingual and multicultural knowledge.

MTurk’s flexible nature is also a big advantage. Workers can choose tasks that match their skill set and work on them at their convenience, so requesters are usually able to get their tasks completed relatively quickly. Additionally, MTurk’s integrated quality control mechanisms help ensure that work output meets a reasonable standard. Cost-effectiveness depends on the complexity of the task: simple jobs can be cheap, but complex jobs that require highly skilled workers can be more expensive than sourcing labor from low-wage countries. The anonymity and impersonal nature of the platform can also mean inconsistent quality and a lack of accountability. Workers on the platform are paid per task, so they may rush work without due attention to quality, especially if the pay is low.
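One common quality-control mechanism on crowdsourcing platforms is redundancy: the same item is labeled by several workers and the answers are aggregated. A minimal sketch of that idea as a majority vote, with invented labels for illustration (real platform mechanisms are more elaborate):

```python
from collections import Counter

def majority_label(labels):
    """Return the most common label and the fraction of workers who agreed."""
    counts = Counter(labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(labels)

# Hypothetical: three workers label the same street photo.
labels = ["pedestrian", "pedestrian", "cyclist"]
best, agreement = majority_label(labels)
print(best, round(agreement, 2))  # pedestrian 0.67
```

Low agreement rates can then flag items for re-labeling, and workers who frequently disagree with the consensus can be reviewed.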

However, some new generative AI models can be primed for specific tasks with just a few samples, as opposed to the thousands of samples and hours of additional training required by pre-“deep learning” models. Computer scientists call this “few-shot learning” and believe that GPT-3 is the first real example of a powerful shift in how humans train machines. With a few simple prompts, system architects were able to get GPT-3 to write programs on its own.
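In practice, “few-shot learning” with models like GPT-3 often means placing a handful of worked examples directly in the prompt before the new query. A minimal sketch of how such a prompt might be assembled (the task and examples are invented for illustration):

```python
def build_few_shot_prompt(examples, query):
    """Assemble a few-shot prompt: worked examples followed by a new query."""
    lines = []
    for sign, meaning in examples:
        lines.append(f"Sign: {sign}\nMeaning: {meaning}\n")
    # End with the new query and an empty slot for the model to fill in.
    lines.append(f"Sign: {query}\nMeaning:")
    return "\n".join(lines)

examples = [
    ("red octagon", "stop"),
    ("yellow triangle with exclamation mark", "general warning"),
]
prompt = build_few_shot_prompt(examples, "red circle with white bar")
print(prompt)
```

The model then continues the text after the final “Meaning:”, inferring the pattern from the two examples rather than from any task-specific training.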

This puts generative AI systems on a completely different trajectory, but it does not reduce their chances of error. In fact, they are known to have flaws such as bias and a tendency toward profanity, and companies like OpenAI have actively encouraged ordinary users to give feedback so that their models can be trained for better results.

That being said, danger still lurks, the first from places like the dark web. I recently wrote about the “jailbreak” phenomenon in generative AI systems, in which an “ethical hacking” firm named Adversa.AI broke GPT-4, Google’s Bard, Anthropic’s Claude and Microsoft’s Bing Chat. The efficiency of disrupting all these models with a single set of commands is astonishing, and a disastrous lesson in the vulnerability of these systems (rb.gy/ovhdz).

But we now have news of more vulnerabilities, according to MIT Technology Review (rb.gy/yrsox). Both offshore workers and MTurk workers offer unique advantages in labeling data for AI programs, but it is important to establish appropriate quality control mechanisms to ensure high-quality data labeling. This is because the quality of AI models depends heavily on the quality of the input data, which starts with the human effort put into it.

Gig workers on platforms such as MTurk appear to be using generative AI to complete tasks, the magazine says. It reports that a team of researchers from the Swiss Federal Institute of Technology hired 44 people on a gig-work platform to summarize 16 extracts from medical research papers. The responses were then analyzed using an AI model that looks for telltale signals of ChatGPT output. The team also examined the workers’ keystrokes for other indicators that the submitted responses had come from elsewhere.
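One keystroke signal that could distinguish typed answers from copied-in ones is text arriving in large single insertions rather than character by character. A toy sketch of such a heuristic, with an invented event format and invented thresholds, purely to illustrate the idea (the researchers’ actual method is not described in detail here):

```python
def looks_pasted(insertion_sizes, paste_threshold=50):
    """Flag a response if most of its text arrived in large single insertions.

    `insertion_sizes` is a hypothetical log: the number of characters added
    by each input event (1 for a normal keystroke, large for a paste).
    """
    total = sum(insertion_sizes)
    pasted = sum(n for n in insertion_sizes if n >= paste_threshold)
    return total > 0 and pasted / total > 0.8

# Typed character by character: 120 single-key events.
print(looks_pasted([1] * 120))       # False
# One 400-character paste plus a few small edits.
print(looks_pasted([400, 1, 1, 1]))  # True
```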

According to the magazine, the team estimated that between 33% and 46% of the workers had used AI models such as OpenAI’s ChatGPT. It further quotes the researchers as saying: “Using AI-generated data to train AI could introduce further errors into already error-prone models. Large language models regularly present false information as fact. If they produce incorrect output that is itself used to train other AI models, the errors can be absorbed by those models and amplified over time.”
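The amplification the researchers warn about can be illustrated with a toy simulation: if each training generation inherits its predecessor’s errors and adds some of its own, the correct fraction of the data shrinks multiplicatively. The rates below are invented for illustration, not measured:

```python
def error_after_generations(base_error, added_error, generations):
    """Toy model of error amplification across training generations.

    A data point stays correct only if it survives every generation,
    so the correct fraction shrinks multiplicatively.
    """
    correct = 1.0 - base_error
    for _ in range(generations):
        correct *= 1.0 - added_error
    return 1.0 - correct

# Starting from a 5% error rate and adding 3% new errors per generation:
for g in (0, 3, 10):
    print(g, round(error_after_generations(0.05, 0.03, g), 3))
```

Even with small per-generation error rates, the compounding means an initially modest 5% error rate grows substantially within a handful of generations.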

It seems to me that the time has come for governments around the world to step in and regulate what could soon become a dangerous trend. But to be honest, I really don’t know where to start.

Siddharth Pai is co-founder of Siana Capital, a venture fund manager.

