This post is written in collaboration with Tatia Tsmindashvili, Ana Kolkhidashvili, Guram Dentoshvili, and Dachi Choladze.
Impel transforms automotive retail with AI-powered customer lifecycle management solutions that support dealer operations and customer engagement. Its core product, Sales AI, offers around-the-clock personalized customer engagement, handling vehicle-specific questions as well as trade-in and financing inquiries. By replacing third-party large language models (LLMs) with a fine-tuned Meta Llama model deployed on Amazon SageMaker AI, Impel achieved a 20% accuracy improvement and greater cost control. The implementation used SageMaker AI's comprehensive feature set, including model training, Activation-aware Weight Quantization (AWQ), and Large Model Inference (LMI) containers. This domain-specific approach not only improves output quality, but also strengthens security and reduces operational overhead compared to general-purpose LLMs.
In this post, we share how Impel enhances the car dealership customer experience using LLMs fine-tuned on SageMaker AI.
Impel's Sales AI
Automotive retailers use Sales AI to optimize how they connect with customers, acting as a digital concierge for vehicle owners and providing personalized experiences at every touchpoint, from initial research to purchase, and from service to repeat business. Sales AI uses generative AI to provide customers with instant responses 24/7 over email and text. Maintaining engagement during the early stages of a customer's car buying journey leads to warm handoffs to showroom appointments and sales teams. Sales AI has three core features that provide this consistent customer engagement:
- Summarization – Summarizes past customer engagement to derive customer intent
- Follow-up generation – Provides consistent follow-ups to engaged customers to prevent stalled customer buying journeys
- Response personalization – Personalizes responses, tailoring them to the retailer's messaging and the customer's purchasing specifications
Two key factors drove the migration away from their existing LLM provider: the need for model customization and for cost optimization. The per-token pricing model of their previous solution made costs harder to control as transaction volume increased, and fine-tuning restrictions prevented them from fully using their own data for model improvement. By deploying a fine-tuned Meta Llama model on SageMaker AI, Impel achieved the following:
- Cost predictability through hosted pricing, eliminating per-token fees
- Greater control over model training and customization, leading to a 20% improvement across core features
- Secure handling of proprietary data within Impel's AWS environment
- Automatic scaling to meet inference demand spikes
Solution overview
Impel chose SageMaker AI, a fully managed cloud service for building, training, and deploying machine learning (ML) models using AWS infrastructure, tools, and workflows, to fine-tune a Meta Llama model for Sales AI. Meta Llama is a powerful model, well suited for industry-specific tasks thanks to its strong instruction-following capabilities, support for extended context windows, and efficient handling of domain knowledge.
Impel used SageMaker LMI containers to deploy LLM inference on SageMaker AI endpoints. These purpose-built Docker containers support LoRA fine-tuned models and AWQ, offering optimized performance for models such as Meta Llama. Impel performed LoRA fine-tuning, an efficient and cost-effective technique for adapting LLMs to specialized applications, in an Amazon SageMaker Studio notebook running on an ml.p4de.24xlarge instance. This managed environment streamlined development, allowing the Impel team to seamlessly integrate popular open source tools such as PyTorch and torchtune for model training. For model optimization, Impel applied AWQ to reduce model size and improve inference performance.
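The post doesn't share Impel's training code, but the core idea of LoRA can be sketched in a few lines of PyTorch: the pretrained weights stay frozen and only a small low-rank update is trained. The layer sizes and hyperparameters below are hypothetical, illustrative values, not Impel's actual configuration.

```python
# Minimal sketch of a LoRA (Low-Rank Adaptation) linear layer.
# Hypothetical dimensions for illustration only.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update.

    Instead of updating the full weight W, LoRA learns two small matrices
    A (r x in_features) and B (out_features x r), so the effective weight
    is W + (alpha / r) * B @ A. Only A and B receive gradients.
    """

    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r  # standard LoRA scaling factor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # B is zero-initialized, so before training the output
        # matches the base layer exactly.
        return self.base(x) + self.scaling * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(64, 32), r=8)
out = layer(torch.randn(4, 64))
print(out.shape)  # torch.Size([4, 32])
```

In practice, tools such as torchtune apply this pattern to the attention and projection layers of a model like Meta Llama, so only a small fraction of the parameters is trained, which is what makes fine-tuning on a single instance cost-effective.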
For production, Impel deployed endpoints on ml.g6e.12xlarge instances, which are equipped with four NVIDIA GPUs and high memory capacity, well suited for efficiently serving large models such as Meta Llama. Impel used the SageMaker built-in automatic scaling feature to scale serving containers automatically based on concurrent requests.
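An LMI container endpoint of this shape is typically configured through a `serving.properties` file. The fragment below is a hypothetical sketch, not Impel's production settings; the model location and batch sizes are placeholders:

```
# Hypothetical serving.properties for a SageMaker LMI (DJL Serving) container.
# option.model_id points at a fine-tuned, AWQ-quantized model artifact in S3.
engine=Python
option.model_id=s3://example-bucket/llama-finetuned-awq/
option.quantize=awq
# Shard the model across the four GPUs of an ml.g6e.12xlarge instance
option.tensor_parallel_degree=4
option.rolling_batch=vllm
option.max_rolling_batch_size=32
```

With a configuration like this, the container loads the quantized model, shards it across the available GPUs, and serves continuous-batched requests behind the SageMaker endpoint, which the automatic scaling policy then scales on concurrency.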
The following diagram shows the solution architecture for model fine-tuning and customer inference.
Impel's Sales AI reference architecture.
Impel's R&D team partnered closely with various AWS teams, including its account team, the generative AI strategy team, and the SageMaker service team. This virtual team collaborated over multiple sprints leading up to the launch of the fine-tuned Sales AI model, optimizing model evaluations, benchmarking SageMaker performance, and identifying the best SageMaker instances and scaling strategy. The partnership included technical sessions, strategic alignment meetings, and post-implementation discussions of cost and operations. This close collaboration helped Impel realize the full potential of its fine-tuned model hosted on SageMaker AI.
Fine-tuned model evaluation process
Impel's move to a fine-tuned Meta Llama model brought improvements across key performance metrics, with notable gains in understanding automotive-specific terminology and generating personalized responses. Structured human evaluation revealed improvements in key customer interaction areas: personalized reply accuracy improved from 73% to 86%, conversation summarization increased from 70% to 83%, and follow-up message generation showed the most significant gain, jumping from 59% to 92%. The evaluation process involved Impel's R&D team scoring responses across various use cases from both their current LLM provider and Impel's fine-tuned model. The following screenshot shows how customers interact with Sales AI.
Examples of customer interactions with Sales AI.
In addition to output quality, Impel measured latency and throughput to validate the model's production readiness. Using awscurl for SigV4-signed HTTP requests, the team confirmed that real-world performance metrics met their targets, ensuring an optimal customer experience in the production environment.
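The post doesn't include Impel's measurement scripts. As a rough illustration, latency percentiles for an endpoint can be collected with a small harness like the following, where `invoke` is a placeholder for the real call (for example, a SigV4-signed request via awscurl, or `invoke_endpoint` in the SageMaker runtime SDK):

```python
# Illustrative latency-measurement harness; `invoke` is a hypothetical
# stand-in for the actual endpoint call, not Impel's tooling.
import statistics
import time

def measure_latency(invoke, payloads, warmup=2):
    """Call `invoke` once per payload and report latency percentiles in ms."""
    for p in payloads[:warmup]:
        invoke(p)  # warm up the endpoint/connection before timing
    samples = []
    for p in payloads:
        start = time.perf_counter()
        invoke(p)
        samples.append((time.perf_counter() - start) * 1000)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
        "max_ms": samples[-1],
    }

# Stub invocation for illustration; replace with a real endpoint call.
stats = measure_latency(lambda p: len(p), ["hello"] * 20)
print(sorted(stats))  # ['max_ms', 'p50_ms', 'p95_ms']
```

Reporting percentiles rather than averages matters here, because tail latency (p95 and above) is what customers actually notice during demand spikes.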
Use domain-specific models for improved performance
Impel's advances in Sales AI came from moving from general-purpose LLMs to domain-specific, fine-tuned models. Fine-tuning a publicly available foundation model on anonymized customer interaction data brought several key improvements: the new model achieved a 20% accuracy increase across core features, a deeper understanding of automotive industry terminology, and more efficient context window usage. By moving to this approach, Impel achieved three major advantages:
- Improved data security through in-house processing within Impel's AWS account
- Reduced dependency on external APIs and third-party providers
- Greater operational control for scaling and customization
These advantages, coupled with the significant output quality improvements, validated Impel's strategic shift toward domain-specific AI models for Sales AI.
Expanding AI innovation in automotive retail
Impel's success deploying a fine-tuned model on SageMaker AI has established a foundation for expanding its AI capabilities to support a wider range of automotive-specific use cases. Impel plans to extend the improved accuracy and performance benefits of in-house, domain-specific models across its customer engagement product suite. Impel's R&D team is also advancing its AI capabilities by incorporating Retrieval Augmented Generation (RAG) workflows, advanced function calling, and agentic workflows. These innovations aim to deliver adaptive, context-aware systems designed to interact, reason, and act across complex automotive retail tasks.
Conclusion
In this post, we discussed how Impel enhanced the car dealership customer experience with LLMs fine-tuned on SageMaker AI.
For organizations considering a similar move to fine-tuned models, Impel's experience demonstrates how working with AWS can deliver both improved accuracy and model customization opportunities while building long-term AI capabilities tailored to specific industry needs. Connect with your account team or visit Amazon SageMaker AI to learn how SageMaker AI can help you deploy and manage fine-tuned models.
About the authors
Nicholas Skozafava is a Senior Solutions Architect at AWS, focused on startup customers. Prior to his current role, he helped enterprise customers navigate their cloud journeys. He is passionate about building and scaling cloud infrastructure, automation, and DevOps on AWS.
Sam Sudakov is a Senior Account Manager at AWS, focused on strategic startup ISVs. Sam specializes in technology landscapes, AI/ML, and AWS solutions, and is passionate about driving startup scaling and SaaS and AI transformation. His work with top AWS startup ISVs focuses on building strategic partnerships and implementing go-to-market initiatives that bridge enterprise technology and innovative startup solutions, while maintaining strict compliance with data security and privacy requirements.
Vivek Gangasani is a Lead Specialist Solutions Architect for Inference at AWS. He helps emerging generative AI companies build innovative solutions using AWS services and accelerated compute. Currently, he focuses on developing strategies for fine-tuning and optimizing the inference performance of large language models. In his free time, Vivek enjoys hiking, watching movies, and trying different cuisines.
Dmitry Soldatkin is a Senior AI/ML Solutions Architect at AWS, helping customers design and build AI/ML solutions. Dmitry's work covers a wide range of ML use cases, with a primary interest in generative AI, deep learning, and scaling ML across the enterprise. He has helped companies in many industries, including insurance, financial services, utilities, and telecommunications. Prior to joining AWS, Dmitry was an architect, developer, and technology leader in data analytics and machine learning in the financial services industry.
Tatia Tsmindashvili is a Senior Deep Learning Researcher at Impel with a master's degree in biomedical engineering and medical informatics. She has over 5 years of experience in AI, with interests in LLM agents, simulation, and neuroscience. You can find her on LinkedIn.
Ana Kolkhidashvili is the Director of R&D at Impel, where she leads AI initiatives focused on large language models and automated conversation systems. She has over 8 years of experience in AI, specializing in large language models, automated conversation systems, and NLP. You can find her on LinkedIn.
Guram Dentoshvili is the Director of Engineering and R&D at Impel, where he leads the development of scalable AI solutions and drives innovation across the company's conversational AI products. He began his career as a machine learning engineer at Pulsar AI, where he played a key role in building AI technology tailored to the automotive industry. You can find him on LinkedIn.
Dachi Choladze is the Chief Innovation Officer at Impel, where he leads initiatives in AI strategy, innovation, and product development. He has over 10 years of experience in technology entrepreneurship and artificial intelligence. Dachi is a co-founder of Pulsar AI, Georgia's first globally successful AI startup, which later merged with Impel. You can find him on LinkedIn.
Deepam Mishra is a Sr Advisor to Startups at AWS, where he advises startups on ML, generative AI, and AI safety and responsibility. Before joining AWS, Deepam co-founded and led AI businesses at Microsoft Corporation and Wipro Technologies. He is a serial entrepreneur and investor who has founded four AI/ML startups. Deepam is based in the NYC metro area and enjoys meeting AI founders.
