AI agents with large language models are transforming enterprise workflows, but high inference costs and latency can limit scalability and user experience. To address this, Nvidia recently announced the Nvidia AI Blueprint for building a data flywheel. This is an enterprise-ready workflow that helps you find efficient models that optimize AI agents with automated experiments to reduce inference costs while improving potential and effectiveness.
At the heart of the blueprint is a self-improvement loop that uses NVIDIA NEMO and NIM microservices to distill, fine-tune and evaluate small models using real production data.
Data Flywheel Blueprints are designed to seamlessly integrate with existing AI infrastructure and platforms, supporting multi-cloud, on-prem and edge environments.
Steps to implement a Data Flywheel Blueprint
This practical demo shows how to use the Data Flywheel blueprint to optimize the model that performs functions and tool calls for virtual customer service agents. We explain how the data flywheel can help replace the larger llama-3.3-70b model with a much smaller llama-3.2-1b model without compromising accuracy, but reduces inference costs by more than 98%.
1. Initial setup
- Bootable using Nvidia, quickly spin up any required GPU calculations
- Deploy NEMO Microservices for model customization and evaluation loops
- Use NIM microservices to provide models via API
- Clone the Data Flywheel Blueprint GitHub repository
2. Take logs and curate
- Collect production agent interactions in OpenAI compatible format
- Save logs to elasticsearch
- Set up the built-in Flywheel Orchestrator Tag, deduplication, curate task-specific datasets, and run continuous experiments
3. Try out the existing model and the new model
- Run Ebal with zero shot, context learning, and fine-tuned setup
- Fine-tune small models using production output and rora – no manual labeling
- Measure accuracy and performance by incorporating tools such as MLFLOW
- Select the model to match or outperform to the original baseline
4. Continuously expand and improve
- View the generated evaluation report
- Develop efficient models that have been surfaced in production
- Continue to improve through automated experiments by ingesting new production data, retraining, repeating flywheel cycles
Start your NVIDIA AI Blueprint for building your data flywheel by watching this new how-to video or downloading it from the NVIDIA API catalog.
