Fine-tune LLMs on your data using Snowflake ArcticTraining and ML Jobs

ArcticTraining supports a wide range of techniques beyond full fine-tuning, including LoRA (Low-Rank Adaptation) fine-tuning and Arctic Long Sequence Training. The new LLM fine-tuning quickstart guide includes recipes for both full fine-tuning and LoRA fine-tuning.

Under the hood

ML jobs run on Snowflake Container Runtime, which is preconfigured with GPU drivers, ML frameworks, and Ray. ArcticTraining directly interfaces with Ray for multi-node training and coordinates batch distribution and gradient synchronization between workers. ArcticTraining also integrates with the DataConnector API to efficiently stream table data directly to training workers.
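To make "batch distribution and gradient synchronization" concrete, here is a minimal single-process sketch of data-parallel training on a toy model y = w * x. It is not ArcticTraining or Ray code: the function names are hypothetical, and the "workers" are simulated with plain lists. Each worker computes a gradient sum on its shard, and the sums are combined so every worker would apply the same update.

```python
from typing import List, Sequence, Tuple

Example = Tuple[float, float]  # (x, y) pair for the toy model y = w * x


def shard_batch(batch: Sequence[Example], num_workers: int) -> List[List[Example]]:
    """Distribute a batch across workers round-robin (hypothetical helper)."""
    return [list(batch[rank::num_workers]) for rank in range(num_workers)]


def local_grad_sum(w: float, shard: Sequence[Example]) -> Tuple[float, int]:
    """Per-worker step: sum of d/dw (w*x - y)^2 over the shard, plus shard size."""
    grad = sum(2.0 * (w * x - y) * x for x, y in shard)
    return grad, len(shard)


def synced_gradient(w: float, batch: Sequence[Example], num_workers: int) -> float:
    """Simulated all-reduce: add up every worker's partial sum and example count,
    then divide, so the result matches the full-batch mean gradient."""
    partials = [local_grad_sum(w, shard) for shard in shard_batch(batch, num_workers)]
    total_grad = sum(grad for grad, _ in partials)
    total_count = sum(count for _, count in partials)
    return total_grad / total_count


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0), (5.0, 10.0)]
grad_two_workers = synced_gradient(1.0, batch, num_workers=2)
grad_one_worker = synced_gradient(1.0, batch, num_workers=1)
```

Summing gradients and counts separately (rather than averaging per-worker means) keeps the result correct even when shards have unequal sizes, as they do here with five examples on two workers.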

Each ML job mounts a stage volume that exposes an internal Snowflake stage as a filesystem path within the container. ArcticTraining detects the mounted stage and writes model checkpoints to it, persisting model weights within Snowflake for later retrieval and evaluation.
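Because the stage appears as an ordinary filesystem path, checkpointing reduces to writing files under that path. The sketch below illustrates the idea with the standard library only. The directory layout (`checkpoint-<step>/state.json`), the function names, and the JSON payload standing in for real model weights are all assumptions for illustration, not ArcticTraining's actual format.

```python
import json
import re
from pathlib import Path
from typing import Optional

# Hypothetical naming scheme: one sub-directory per training step.
CHECKPOINT_NAME = re.compile(r"checkpoint-(\d+)")


def save_checkpoint(mount_dir: Path, step: int, state: dict) -> Path:
    """Write a checkpoint under the mounted path; JSON stands in for weights."""
    ckpt_dir = mount_dir / f"checkpoint-{step}"
    ckpt_dir.mkdir(parents=True, exist_ok=True)
    (ckpt_dir / "state.json").write_text(json.dumps(state))
    return ckpt_dir


def latest_checkpoint(mount_dir: Path) -> Optional[Path]:
    """Return the checkpoint directory with the highest step, or None."""
    if not mount_dir.is_dir():
        return None
    found = []
    for child in mount_dir.iterdir():
        match = CHECKPOINT_NAME.fullmatch(child.name)
        if match and child.is_dir():
            found.append((int(match.group(1)), child))
    if not found:
        return None
    # Compare step numbers as integers so step 1000 sorts after step 900.
    return max(found, key=lambda item: item[0])[1]
```

In a real job, `mount_dir` would be whatever path the stage volume is mounted at inside the container; a later evaluation job that mounts the same stage could call `latest_checkpoint` to pick up the most recent weights.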

Why LoRA?

Full fine-tuning updates all model weights. For a typical LLM, this means training tens or even hundreds of billions of parameters. This requires a large amount of GPU memory and can take several hours even on high-end hardware.
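A rough back-of-the-envelope estimate shows where the memory goes. The sketch below assumes mixed-precision training with the Adam optimizer, a common setup in which each parameter costs about 16 bytes of training state: 2 for the 16-bit weight, 2 for its gradient, and 12 for the 32-bit master weight plus two Adam moment estimates. Activations are excluded, so real usage is higher. The function name and the example model sizes are illustrative.

```python
def full_finetune_state_gb(num_params: float, bytes_per_param: int = 16) -> float:
    """Approximate training-state memory in GB (weights, gradients, optimizer).

    The default of 16 bytes/param assumes mixed precision with Adam:
    2 (16-bit weight) + 2 (16-bit gradient) + 4 (32-bit master weight)
    + 4 (Adam momentum) + 4 (Adam variance). Activations are not counted.
    """
    return num_params * bytes_per_param / 1e9


# Illustrative model sizes, in parameters.
estimates = {
    "1.7B": full_finetune_state_gb(1.7e9),
    "8B": full_finetune_state_gb(8e9),
    "70B": full_finetune_state_gb(70e9),
}
```

Under these assumptions, even a 1.7B-parameter model needs roughly 27 GB of training state before any activations are stored, and an 8B model needs about 128 GB, which already exceeds a single typical GPU.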

LoRA takes a different approach. It freezes the pre-trained model and injects small trainable low-rank matrices into the transformer layers, typically training only 0.1% to 1% of the original parameters. The result is faster training, lower memory usage, and smaller adapter files (often tens of megabytes) that can be swapped at inference time. You can train multiple adapters for different tasks and serve them from a single base model.
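The parameter savings follow directly from the shapes involved. For a frozen weight matrix of shape d_out x d_in, LoRA learns an update B A, where B is d_out x r and A is r x d_in, so only r * (d_in + d_out) values are trained. The sketch below works through the arithmetic; the 4096 x 4096 layer and rank 16 are illustrative values, not settings taken from the quickstart.

```python
def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters in one LoRA adapter: B (d_out x r) plus A (r x d_in)."""
    return rank * (d_in + d_out)


def trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Adapter parameters as a fraction of the frozen matrix they adapt."""
    return lora_param_count(d_out, d_in, rank) / (d_out * d_in)


# Illustrative projection matrix and rank.
adapter_params = lora_param_count(4096, 4096, rank=16)
fraction = trainable_fraction(4096, 4096, rank=16)
```

Here the adapter has 131,072 trainable parameters against 16,777,216 frozen ones, or about 0.78% of that matrix. The whole-model fraction is usually lower still, because adapters are typically attached to only some of the weight matrices.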

Evaluation of results

Assessing the quality of generated text is a difficult problem. Traditional text evaluation metrics such as BLEU and ROUGE cannot account for paraphrases and synonyms, nor can they capture tone or semantic meaning. Human evaluation, on the other hand, is subjective and prohibitively expensive at scale. Although the state of the art is constantly evolving, the most successful approach currently is LLM-as-judge, where a powerful model acts as a judge and evaluates the generated output against ground truth or predefined scoring criteria. This scales better than human annotation and gives a more nuanced evaluation than simple text matching.
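A minimal sketch of the LLM-as-judge pattern looks like this. Everything here is an assumption for illustration: the prompt wording, the 1-to-5 scale, the `SCORE:` reply convention, and the `call_llm` callable, which stands in for whatever client you use to query the judge model.

```python
import re
from typing import Callable, Optional

# Hypothetical grading prompt; adapt the rubric to your own scoring criteria.
JUDGE_TEMPLATE = """You are grading a model output against a reference answer.
Score from 1 (wrong) to 5 (fully correct and faithful to the reference).
Explain briefly, then end with a line of the form: SCORE: <number>

Reference:
{reference}

Candidate:
{candidate}
"""


def build_judge_prompt(reference: str, candidate: str) -> str:
    """Fill the grading template with the ground truth and the generated text."""
    return JUDGE_TEMPLATE.format(reference=reference, candidate=candidate)


def parse_score(reply: str) -> Optional[int]:
    """Extract a 1-5 score from the judge's reply; None if it is missing."""
    match = re.search(r"SCORE:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None


def judge(reference: str, candidate: str,
          call_llm: Callable[[str], str]) -> Optional[int]:
    """Ask the judge model (any callable mapping prompt -> reply) for a score."""
    return parse_score(call_llm(build_judge_prompt(reference, candidate)))
```

Returning `None` for unparseable replies, instead of guessing a score, lets you count and inspect judge failures separately from genuinely low scores.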

Learn more

We are making significant investments to make advanced ML technology available to all Snowflake customers, including:

Integrated experiment tracking: Compare runs, tune hyperparameters, and manage model versions.

Seamless deployment: From training to production inference within Snowflake.

Reinforcement learning: For further performance improvement.

With ML Jobs and ArcticTraining, fine-tuning an LLM on your own data doesn’t require a dedicated infrastructure team or risky data exports.

Ready to try it for yourself? We’ve published a quickstart guide that walks you through the complete process from data preparation to training to evaluation. This pattern applies to any domain with unique text and structured output requirements, such as financial reports, legal documents, and customer support. Check out our guide.

Beyond the quickstart

Our quickstart prioritizes accessibility. It runs in under an hour on mid-tier GPUs, making it easy to experiment and iterate. For fine-tuning production workloads, we’ve compiled a list of recommended optimizations to consider for better performance in real-world scenarios.

Larger base model: We used Qwen3-1.7B for faster training and better memory efficiency. Scaling up to an 8B or 14B parameter model provides stronger baseline capabilities and may result in better fine-tuned performance, especially for complex medical reasoning.

More training data: The max_length setting in the training recipe excludes long dialogues to fit within GPU memory constraints. Increasing this limit (or using gradient checkpointing to handle longer sequences) retains more training examples and exposes the model to more diverse clinical scenarios.

Extended training: We trained for just 1-2 epochs to quickly demonstrate the workflow. Longer training runs with learning rate scheduling and early stopping based on validation metrics can help your model converge more completely.

Hyperparameter tuning: LoRA rank, learning rate, and batch size all affect the final quality. Systematic experimentation, made easy by reproducible ML Jobs submissions, can yield meaningful gains.
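As a starting point for systematic experimentation, a simple grid over the three hyperparameters named above can be enumerated with the standard library. The parameter names and candidate values below are illustrative assumptions, not recommended settings, and submitting each configuration as a job is left out because it depends on your own setup.

```python
from itertools import product
from typing import Any, Dict, Iterator, List


def grid(search_space: Dict[str, List[Any]]) -> Iterator[Dict[str, Any]]:
    """Yield one config dict per combination of values in the search space."""
    names = list(search_space)
    for values in product(*(search_space[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical search space; choose ranges that suit your model and data.
search_space = {
    "lora_rank": [8, 16, 32],
    "learning_rate": [1e-4, 2e-4],
    "batch_size": [8, 16],
}

configs = list(grid(search_space))
```

Each dictionary in `configs` describes one training run. Recording it alongside the run's validation metrics makes it straightforward to compare results later. A full grid grows multiplicatively, so for larger spaces random sampling of configurations is a common alternative.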

Forward-Looking Statements: This content contains forward-looking statements, including regarding future product features. These statements are not promises to deliver any material, code or functionality, and actual results may vary.
