Achieve 10,000x training data reduction with high-fidelity labels

Experiment

We wanted to understand which models and tasks would benefit most from the curation process. As baselines for the experiment, we used crowdsourced labels to fine-tune two LLMs of different sizes (Gemini Nano-1 with 1.8B parameters and Nano-2 with 3.25B parameters) on two tasks of different complexity (lower and higher, based on expert alignment). Each crowdsourced dataset has ~100K annotations and a strong class imbalance, with around 95% benign labels on average.

Each of these four baseline conditions was compared with the corresponding curation condition, in which each model (Nano-1 and Nano-2) is fine-tuned over multiple rounds using the curation process described above. In each iteration, we selected a set of curated examples and used them for model evaluation and fine-tuning, as described above. All models plateaued before reaching parity with the experts' internal alignment, so we stopped at six iterations (~400 fine-tuning and ~250 evaluation samples) for the lower complexity task and five iterations (~250 fine-tuning and ~150 evaluation samples) for the higher complexity task. (Note that the lower complexity task had a larger variety of examples, which may explain the longer time needed to converge.) Both final datasets had a class balance of approximately 40% positive examples.

The following table provides an overview of the scale and quality of the data used in each condition. Through the curation process, experts reached an average pairwise Cohen's kappa of […] (lower complexity task) and .78 (higher complexity task). We consider these values to be the ceiling for model performance. To assess the quality of the crowdsourced data, we calculated the kappa alignment between the crowdsourced annotations and the experts on the complete curation set, which was .59 (lower complexity) and .41 (higher complexity).
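
For readers unfamiliar with the metric, here is a minimal Python sketch of Cohen's kappa and its average over all annotator pairs. The function names (`cohen_kappa`, `average_pairwise_kappa`) and the toy labels are illustrative assumptions, not the code or data used in the experiment. Kappa corrects raw agreement for the agreement expected by chance, which matters on data that is ~95% benign.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("label sequences must be non-empty and equal length")
    n = len(a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: computed from each rater's own label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    labels = count_a.keys() | count_b.keys()
    p_e = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if p_e == 1.0:
        # Both raters used one identical label throughout; kappa is
        # undefined here, and this sketch treats it as perfect agreement.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def average_pairwise_kappa(ratings):
    """Mean Cohen's kappa over every pair of raters."""
    pairs = list(combinations(ratings, 2))
    if not pairs:
        raise ValueError("need at least two raters")
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


# Toy labels (hypothetical): 1 = positive, 0 = benign.
expert_a = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
expert_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
kappa_ab = cohen_kappa(expert_a, expert_b)

# On a 95% benign set, a rater who always says "benign" agrees with the
# reference 95% of the time, yet kappa is zero.
reference = [0] * 95 + [1] * 5
always_benign = [0] * 100
kappa_trivial = cohen_kappa(reference, always_benign)
```

In the toy example, the two experts disagree on one item out of ten, and the trivial rater shows why raw agreement alone would overstate label quality on imbalanced data.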
