Google uses tools that collect training data for AI-driven applications, such as self-driving cars, from our interactions with the internet.

Companies may call us all “users” or “customers,” but that hides what they really do.
Opinion: While there is growing debate and concern about the impact of artificial intelligence on our daily lives, work and recreation, one aspect of it escapes most commentary. To do what humans do, AI systems need to learn from humans. But who is teaching them? Almost all of us.
Every day, in different contexts, we humanize AI systems by performing small “tasks” on our phones and laptops. AI companies are very good at making us work for them without our realizing it. The hundreds of small tasks we perform every day as we browse the internet or interact with our mobile devices are exploited by AI companies to train their systems to do what humans do.
My colleagues and I have investigated these exploitative practices in depth. The conclusion of our study, “Unconscious Workers: AI Training to Humanize,” was clear: AI trainers (that is, all of us) are currently being exploited as unpaid workers.
You may know that most AI applications are based on a technique called machine learning. Using this technique, machines are fed large amounts of application-specific data about the human abilities they are meant to emulate, and from that data they learn to perform human tasks (playing chess, composing songs, drawing pictures, and so on). This data is more complicated than what most of us think of as data. It includes familiar information such as age, gender, location and buying habits, but it also contains the results of a myriad of human activities that companies quietly oblige us to perform.
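To make the idea concrete, here is a minimal sketch of supervised machine learning in Python. It is a toy illustration, not any company’s real pipeline: the features, the labels and the choice of a logistic-regression model are all my own assumptions.

```python
# A toy sketch of machine learning in general: a model is fed
# application-specific examples of human behavior and learns to
# reproduce the pattern. Not any real company's pipeline.
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: [age, hours online per day] for a set
# of users, and whether each one bought a product (1) or not (0).
X = [[25, 6], [34, 2], [19, 8], [52, 1], [41, 5], [23, 7]]
y = [1, 0, 1, 0, 1, 1]

model = LogisticRegression()
model.fit(X, y)  # the machine "learns" from human-generated examples

# Predict for a new, unseen user.
print(model.predict([[30, 6]]))  # e.g. [1]: likely to buy
```

The point of the sketch is the dependency, not the math: without the human-generated rows in `X` and `y`, the model has nothing to learn from.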
One of the most common of these data-generating tasks involves reCAPTCHA: those annoying online tests that get in the way when we try to log into or access pages on online services and, ironically, require us to prove we are not robots. You might be shown an image of a road and asked to identify which squares contain traffic lights.
The tool was originally implemented to distinguish humans from bots and to prevent bots from accessing services, but since 2014 Google has used it to collect training data for AI-driven applications such as self-driving cars. (If you’re like me and like to trick reCAPTCHA by clicking on a fire hydrant when prompted to identify a traffic light, you could be responsible for a self-driving car accident in the future.)
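The mechanism, as far as it can be reconstructed from the outside, is simple: many people answer the same test, and their answers are aggregated into labels. The sketch below is my own guess at that aggregation step, not Google’s actual code; the tile name and the majority-vote rule are assumptions.

```python
# A hedged sketch of how reCAPTCHA clicks could become labeled
# training data: many humans answer for the same image tile, and a
# majority vote turns those answers into a single label.
from collections import Counter

# Hypothetical answers from different users for one image tile.
answers = {
    "tile_042.jpg": ["traffic_light", "traffic_light", "fire_hydrant",
                     "traffic_light", "traffic_light"],
}

def aggregate_label(votes):
    """Majority vote over human answers; mischievous clicks are outvoted."""
    label, count = Counter(votes).most_common(1)[0]
    return label, count / len(votes)

for tile, votes in answers.items():
    label, confidence = aggregate_label(votes)
    print(tile, label, f"{confidence:.0%}")  # tile_042.jpg traffic_light 80%
```

Which is also why a lone prankster clicking on fire hydrants mostly gets outvoted; it would take a lot of us, clicking together, to mislabel a tile.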
Another notable example of exploitation by AI companies is the content recommendation system, one of the most voracious harvesters of humanness in this story. For example, Spotify feeds human-generated data into its system to recommend new music to its users. If many users put the same two songs in the same playlist, Spotify’s AI learns that the songs likely have something in common, and it uses this information to curate playlists and suggest songs to users.
If you create a playlist of “happy songs,” for instance, the playlist curator can look at songs that people tend to put in playlists called “happy” and identify the characteristics of what constitutes a “happy song.” So if you’re a music lover, then even if you don’t use Spotify, chances are Spotify can still use your musical expertise and tastes to improve its own recommendation algorithm.
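Here is a minimal sketch of that idea, playlist co-occurrence counting, in Python. It illustrates the general technique rather than Spotify’s actual algorithm; the playlists and song titles are made up.

```python
# A minimal sketch of playlist co-occurrence counting: songs that
# appear together in many user playlists are treated as similar.
# An illustration of the technique, not Spotify's real system.
from collections import defaultdict
from itertools import combinations

# Hypothetical user-made playlists.
playlists = [
    ["Happy", "Walking on Sunshine", "Good Vibrations"],
    ["Happy", "Walking on Sunshine", "Lovely Day"],
    ["Walking on Sunshine", "Good Vibrations"],
]

co_counts = defaultdict(int)
for playlist in playlists:
    for a, b in combinations(sorted(set(playlist)), 2):
        co_counts[(a, b)] += 1  # one more human said these belong together

# Recommend the songs most often placed alongside a seed song.
seed = "Happy"
scores = {pair: n for pair, n in co_counts.items() if seed in pair}
for (a, b), n in sorted(scores.items(), key=lambda kv: -kv[1]):
    other = b if a == seed else a
    print(f"{other}: placed alongside {seed!r} {n} time(s)")
```

Every increment of `co_counts` in this sketch is a human judgment, contributed for free by whoever assembled the playlist.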
Spotify also employs thousands of bots, or crawlers, that automatically read and analyze all kinds of music blogs and reviews to learn how music is described, what people think of it, and which connections people draw between songs. Simply put, a recommendation system profits by recommending to you what it has learned from someone else, somewhere, and it can legally extract that person’s humanness without permission.
Of course, you’ve heard of AI systems that generate text, music, visual art or other forms of output from a few text prompts. ChatGPT made headlines this year, but there are many systems that generate human-like content. Again, all of this happens because we have all been unknowingly training them to act like humans.
The training data for these systems comes from image datasets published on the internet, such as ImageNet. The images are “scraped” (automatically identified and downloaded) from websites such as Flickr, YouTube and Instagram, and they include photographs and drawn or painted art uploaded to the internet by humans (professional or otherwise).
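For illustration, this is roughly what a scraper can look like: a short script that walks a list of image URLs and saves each file into a training-data folder. The URLs below are placeholders, and the script is a generic sketch, not how ImageNet or any particular dataset was actually assembled.

```python
# A generic sketch of image "scraping": given a list of image URLs,
# download each file into a local training-data folder.
import os
import requests

IMAGE_URLS = [
    "https://example.com/photos/cat_001.jpg",     # placeholder URLs
    "https://example.com/photos/street_002.jpg",
]

os.makedirs("training_data", exist_ok=True)
for url in IMAGE_URLS:
    response = requests.get(url, timeout=10)
    if response.ok:
        filename = os.path.join("training_data", url.rsplit("/", 1)[-1])
        with open(filename, "wb") as f:
            f.write(response.content)  # a human's upload becomes training data
```

Scale that loop up to billions of URLs and you have, in outline, how a generative model’s training set is gathered from human uploads.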
Even the way we filter spam in our inboxes is a form of AI training. With millions of users receiving billions of emails every day, spam filters have a significant corpus of human-labeled data from which to learn to classify messages as spam or not spam. This is a form of collaborative filtering similar to that of recommendation engines, but used to remove content rather than promote it.
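A toy sketch of that idea: every “report spam” click becomes a label, and the word counts of flagged mail shift how future messages are scored. This crude word-count scorer is my own illustration, far simpler than any real provider’s filter.

```python
# A toy sketch of how users' spam flags become training data: each
# "report spam" click is a label, and word frequencies in flagged
# mail shift future classifications. Not any real provider's filter.
from collections import Counter

# Hypothetical emails already labeled by users' spam reports.
spam = ["win money now", "free prize claim now"]
ham = ["meeting notes attached", "lunch at noon"]

spam_words = Counter(w for msg in spam for w in msg.split())
ham_words = Counter(w for msg in ham for w in msg.split())

def spam_score(message):
    """Crude score: net count of spam-flagged vs. legitimate words."""
    words = message.split()
    return sum(spam_words[w] for w in words) - sum(ham_words[w] for w in words)

print(spam_score("claim your free prize now"))  # positive: looks like spam
print(spam_score("meeting at noon"))            # negative: looks legitimate
```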
The ethical issue with all these systems is that the individuals who interact with them are largely unaware of these data collection practices and of the value they add, and that, even though companies derive huge profits from our contributions, we receive nothing in return. Our contention is that providing unpaid labor without agreement or compensation, while creating surplus value for private enterprises, is a form of labor exploitation.
Companies may refer to all of us as “users” or “customers,” but that is a way of masking ongoing labor exploitation. The power imbalance between unwitting workers and tech companies is immense. Our society allows business models based on exploitation to flourish, and governments should take an interest in correcting this power imbalance.
The research described in this article was conducted by Dr. Fabio Moreale in collaboration with Dr. Elham Bamantimouri, Dr. Brent Burmester, Dr. Andrew Cheng and Dr. Michelle Thorpe, and was published in the Springer journal AI & Society, where it is freely accessible.
