Many NLP applications, especially training classifiers and evaluating the performance of unsupervised models, require high-quality labeled data. For example, researchers often try to classify texts into thematic or conceptual categories, filter noisy social media data for relevance, or measure sentiment and stance. Whether supervised, semi-supervised, or unsupervised methods are used for these tasks, labeled data are needed to provide a training set or a benchmark against which results can be compared. Such data may be available for high-level tasks such as semantic analysis and hate speech detection, and occasionally for more specialized goals such as political party ideology.
Researchers typically need to create their own annotations to ensure that labels match their conceptual categories. Until recently, there were only two basic approaches. First, researchers can hire research assistants and train them as coders. Second, they can rely on crowd workers on platforms such as Amazon Mechanical Turk (MTurk). These two approaches are frequently combined, with trained annotators generating a small gold-standard dataset and crowd workers expanding the labeled data. Each approach has its own advantages and disadvantages. Trained annotators are expensive but often produce high-quality data.
Crowd workers are much more affordable and flexible, but the quality of their work may be lower, especially for challenging tasks and languages other than English. There are also concerns that the quality of MTurk data has declined, and other platforms such as CrowdFlower and FigureEight are no longer viable options for academic research after being acquired by the business-focused company Appen. The researchers investigate the potential of large language models (LLMs) for text annotation tasks, with a particular focus on the recently released ChatGPT. They show that zero-shot ChatGPT classification (i.e., without additional training) outperforms MTurk annotations.
LLMs have performed very well on a variety of tasks, such as classifying legislative proposals, ideological scaling, solving cognitive psychology problems, and emulating human samples for research studies. Several studies have suggested that ChatGPT can perform this kind of text annotation task, but to the best of the authors' knowledge, a thorough evaluation had not yet been carried out. Their analysis uses a sample of 2,382 tweets collected for a previous study. In that project, the tweets were annotated by trained annotators (research assistants) for five separate tasks: relevance, stance, topics, and two types of frame detection.
Using the same codebook they had created to train their research assistants, the researchers submitted the tasks to ChatGPT as zero-shot classifications and to crowd workers on MTurk. They then evaluated ChatGPT's performance against two benchmarks: (i) its accuracy relative to that of crowd workers, and (ii) its intercoder agreement relative to that of both crowd workers and the trained annotators. They found that ChatGPT's zero-shot accuracy was higher than MTurk's in four of the five tasks. On intercoder agreement, ChatGPT outperformed both MTurk and the trained annotators on all tasks.
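The two benchmarks can be made concrete with a small sketch. It assumes that accuracy is the share of labels matching the trained annotators' labels and that intercoder agreement is the share of items on which two coders assign the same label. The article does not spell out either definition, and the function names and toy labels below are hypothetical, not the researchers' data.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Share of items whose predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("label sequences must have the same length")
    if len(gold) == 0:
        raise ValueError("label sequences must not be empty")
    matches = sum(p == g for p, g in zip(predicted, gold))
    return matches / len(gold)


def percent_agreement(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Share of items on which two coders chose the same label.

    Assumption: raw percent agreement, with no correction for chance.
    """
    if len(coder_a) != len(coder_b):
        raise ValueError("label sequences must have the same length")
    if len(coder_a) == 0:
        raise ValueError("label sequences must not be empty")
    same = sum(a == b for a, b in zip(coder_a, coder_b))
    return same / len(coder_a)


# Toy data (invented for illustration): a relevance task on five tweets.
gold = ["relevant", "irrelevant", "relevant", "relevant", "irrelevant"]
run_1 = ["relevant", "irrelevant", "relevant", "irrelevant", "irrelevant"]
run_2 = ["relevant", "irrelevant", "relevant", "relevant", "irrelevant"]

print(f"accuracy of run 1 vs. gold: {accuracy(run_1, gold):.2f}")
print(f"agreement between runs:     {percent_agreement(run_1, run_2):.2f}")
```

Raw percent agreement is easy to interpret but does not account for agreement that would occur by chance, so a chance-corrected statistic may be preferable when label distributions are very skewed.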
ChatGPT is also much more affordable than MTurk. The five classification tasks cost about $68 on ChatGPT (25,264 annotations), while the same tasks cost $657 on MTurk (12,632 annotations). ChatGPT therefore costs only about $0.003 per annotation (roughly a third of a cent), making it about 20 times cheaper than MTurk while offering superior quality. At this cost, it becomes feasible to annotate entire samples or to build fairly large training sets for supervised learning.
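The per-annotation figures follow directly from the reported totals. The short sketch below reproduces that arithmetic; the dollar amounts and annotation counts are the ones quoted in this article, while the function and variable names are illustrative.

```python
# Totals as reported in the article.
CHATGPT_TOTAL_USD = 68.0
CHATGPT_ANNOTATIONS = 25_264
MTURK_TOTAL_USD = 657.0
MTURK_ANNOTATIONS = 12_632


def cost_per_annotation(total_usd: float, n_annotations: int) -> float:
    """Average cost of a single annotation in US dollars."""
    if n_annotations <= 0:
        raise ValueError("n_annotations must be positive")
    return total_usd / n_annotations


chatgpt_unit = cost_per_annotation(CHATGPT_TOTAL_USD, CHATGPT_ANNOTATIONS)
mturk_unit = cost_per_annotation(MTURK_TOTAL_USD, MTURK_ANNOTATIONS)

# How many times more MTurk costs per annotation than ChatGPT.
ratio = mturk_unit / chatgpt_unit

print(f"ChatGPT: ${chatgpt_unit:.4f} per annotation")
print(f"MTurk:   ${mturk_unit:.4f} per annotation")
print(f"MTurk costs about {ratio:.1f}x as much per annotation")
```

With these inputs the ratio comes out a little above 19, which is consistent with the "about 20 times" figure once the unit costs are rounded.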
Based on their tests, the researchers estimate that 100,000 annotations would cost about $300. These findings show how ChatGPT and other LLMs can change the way researchers annotate data and upend parts of the business model of platforms like MTurk. However, more research is needed to fully understand how ChatGPT and other LLMs perform in broader contexts.
All credit for this research goes to the researchers of this project.
Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing a Bachelor’s Degree in Data Science and Artificial Intelligence from the Indian Institute of Technology (IIT), Bhilai. He spends most of his time on projects aimed at harnessing the power of machine learning. His research interest is image processing and his passion is building solutions around it. He loves connecting with people and collaborating on interesting projects.