Online labor marketplaces like Upwork have yet to develop meaningful policies governing the use of generative AI tools to bid on and fulfill posted jobs. According to machine learning firm Intuition Machines, that lack of clarity is putting these platforms at risk.
Recently, a research team at Intuition Machines’ hCaptcha bot detection service tested whether workers bidding on jobs posted on Upwork used generative AI tools such as ChatGPT to automate the bidding process. The trend is immediately apparent when searching community forums for these services.
“The model for these platforms is to allow job applicants to win multiple bids,” said the researchers in a report provided to The Register. “So revenue is determined by the number of jobs someone bids on and the time it takes to respond.”
The hCaptcha report argues that this creates an incentive for job applicants to automate their side of the bidding process.
To test this theory, the hCaptcha researchers created a job post with screening questions designed so that a domain expert could answer them within five minutes, but that known LLMs would answer inaccurately.
The question was drawn from an old article on anomaly detection using SQL. Prospective bidders were given a two-column sample data structure, including column types, and asked to construct a valid query to find anomalies. A general formula was suggested but not required. The resulting answer had to run on ClickHouse, an open source database for real-time applications.
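The actual screening question is not public, but the kind of check it describes is straightforward: given two columns of data, flag rows whose value deviates too far from the mean. As an illustration only, here is a hypothetical Python sketch of that z-score-style formula; the sample rows and the 1.5-sigma threshold are invented for the example (in the real screening, the equivalent logic had to be expressed as a SQL query runnable on ClickHouse):

```python
from statistics import mean, stdev

# Hypothetical two-column sample (timestamp, value), standing in for the
# data structure described in the job post -- not the actual test data.
rows = [
    ("2023-05-01 00:00", 10.2),
    ("2023-05-01 01:00", 9.8),
    ("2023-05-01 02:00", 10.5),
    ("2023-05-01 03:00", 42.0),  # injected outlier
    ("2023-05-01 04:00", 10.1),
]

values = [v for _, v in rows]
mu, sigma = mean(values), stdev(values)

# Flag rows whose value deviates from the mean by more than 1.5 standard
# deviations -- one version of the "general formula" a z-score approach suggests.
anomalies = [(t, v) for t, v in rows if abs(v - mu) > 1.5 * sigma]
print(anomalies)
```

An expert can write this in minutes; the point of the screen was that an LLM asked to do the same against an unfamiliar schema tends to invent column names and functions that don't exist, which is exactly what the researchers observed.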
The company planned to hire whoever provided the correct answer to write a tutorial on the subject. The job ad stated that responses would be validated, and that since LLMs would not produce a valid response, applicants should not bother submitting LLM-generated answers.
Of the 14 initial bids submitted, nine answered the screening questions. All nine responses were LLM-generated and all were incorrect, exhibiting hallucinated functions, hallucinated columns, and other errors, so the researchers were ultimately unable to hire anyone.
“We have been working on generative AI for many years, looking at both its use and abuse,” Eli-Shaoul Khedouri, founder and CEO of Intuition Machines, told The Register. “What has happened in the past year is that the performance of these large language models has significantly outpaced the systems put in place to detect them.
“Consider what Upwork built years ago: they have all sorts of spam detection features to keep people from circumventing their policies, but against the current generation of models they are completely ineffective.”
Khedouri said the hCaptcha team has found this to be the case on other sites as well, though that data has not yet been made public. “This problem is not impossible to fix, so we thought this would be a good way to draw attention to it.”
As hCaptcha explained on their website last month, there are valid techniques for detecting LLM output.
“If you are using something like standard screening techniques, or relying on the messages sent to determine the veracity of a profile, those need to be re-evaluated,” Khedouri said. “We basically determined that essentially 100 percent of people are going to use these tools right now. This means that you are not measuring the performance of your users, but the performance of these models.”
“In this particular case, we determined that there was no human value added. None of the respondents exceeded the value added by the model.”
Based on such a small sample, The Register cannot conclude that everyone on these platforms is using AI tools, but clearly many participants are.
Khedouri said many people seem to have given up on detecting the involvement of automated systems, but he doesn’t think that’s justified. “It’s not that they can’t do it,” he insisted. “But they need to recognize that this is a real problem and put something in place. Otherwise the value of a platform that doesn’t respond, or doesn’t do so effectively, will fundamentally decline.”
While the use of language models at scale continues to be a hot topic in various online forums, efforts to automate this kind of work predate the current AI boom. Before large language models worked this well, accounts of automating jobs regularly appeared in online posts and news articles. These were often well received and sparked lively debate about the ethics of revealing that a particular set of tasks could be handled by code.
LLMs and other machine learning models are performing better and are easier to run, and models can now interact with other computing and network services. Combined with the fact that so many people work remotely, often with limited oversight, this makes job automation seem practical across a wide range of tasks.
On Fiverr, another online freelance platform, posts examining the impact of AI models include warnings that the service is struggling to keep up with ChatGPT. Responding to a suggestion that buyers hold Zoom meetings with sellers to verify they can communicate without AI assistance, freelance writer Vicky Ito argued that the problem is not just communication, but quality.
“In the last month alone, many buyers have come to me to correct or rewrite entire pieces of content written with ChatGPT,” Ito, who confirmed authorship of the post to The Register, said. “In all these cases, the seller had promised native-level fluency in English, and in all these cases the buyer quickly found the work to be of no use to them.”
“These buyers then approached me with a diminished sense of trust, requiring additional ‘evidence’ that I was indeed fluent in English and that my writing was done by hand.”
Fiverr did not immediately respond to a request for comment.
In January, Upwork’s community manager said that Upwork freelancers should clearly disclose to their clients when artificial intelligence was used to create content such as job proposals and messages.
However, a month later, a member of the Upwork community asked for clarification on ChatGPT’s status. “On Upwork, it will be used to hide the fact that freelancers are unskilled, which is deceptive and will cause confusion on the client’s side,” said a user posting as “Jeanne H.”
As of March, a community manager described Upwork’s policy as a recommendation, saying, “At this time, Upwork does not explicitly endorse or prohibit the use of AI. It’s up to you and your client to decide whether or not to use it.”
The Register asked Upwork for comment on the impact of generative AI tools and its policies governing their use.
The company reported that the average number of weekly search queries related to generative AI in Q1 2023 increased by more than 1,000 percent compared to Q4 2022, and that the average number of weekly job posts related to generative AI grew by more than 600 percent over the same period.
“To meet this explosive demand, we have continued to update our Talent Marketplace to reflect exciting new skills and roles, such as prompt engineer, and have added new Project Catalog job categories, bringing the total number of categories on Upwork to more than 125,” an Upwork spokesperson said.
The spokesperson also pointed to policy changes announced in April: the company revised its optional contract terms and its terms of use to clarify how and when generative AI may be used.
The optional contract terms allow buyers to contractually prohibit tools such as ChatGPT. And the terms of use now include a fraud clause prohibiting the use of generative AI or other tools to substantially augment proposals or deliverables where such use is restricted by the client or violates the rights of third parties.
“Ultimately, it is up to clients and freelancers themselves, based on the terms of their contract, to decide whether a generative AI tool is right for their project,” Upwork’s policy states. ®
