Domain-specific AI beats popular models in business applications

Visma's AI team is quietly redefining document processing across Europe. With nearly a decade of experience, Visma Machine Learning Assets processes more than 18 million documents per month, powering key business processes with highly specialized AI models. What began as an effort to streamline accounting workflows has grown into a wide range of initiatives that combine modern AI with real business needs.

Like many AI teams in the mid-2010s, Visma's group initially relied on traditional deep learning methods such as recurrent neural networks (RNNs), similar to the systems that powered Google Translate in 2015. The arrival of transformers prompted a change of course. “We scrapped all our development plans, and since then we've used only transformers,” says Claus Dahl, director of ML Assets at Visma. “We realized that transformers are the future of language and document processing, so we decided to rebuild the stack from scratch.”

This shift occurred before transformer-based systems like ChatGPT hit the mainstream. The team's first transformer-powered product entered production in 2021, giving Visma a valuable head start in adopting cutting-edge NLP technology.

The team's flagship product is a robust document extraction engine that handles documents from the countries in which Visma companies operate. It supports a variety of languages and processes documents such as invoices and receipts. The engine identifies key fields such as dates, totals, and customer references and feeds them directly into accounting workflows.

Complementing this is a tool that automatically labels transactions using learned business behavior. This system adapts to individual organizations and suggests appropriate account numbers, department codes, and other required metadata based on past activities.
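To make the idea concrete, here is a minimal Python sketch of per-organization suggestions based on past activity. It is not Visma's implementation, which the article does not describe; it simply proposes the account number an organization has used most often for a given vendor. The class name `AccountSuggester`, the organization IDs, the vendor name, and the account numbers are all hypothetical.

```python
from collections import Counter, defaultdict


class AccountSuggester:
    """Toy suggester: proposes the account an organization used most often
    for a vendor. A hypothetical stand-in for a learned model."""

    def __init__(self):
        # org_id -> vendor -> Counter of account numbers from past bookings
        self._history = defaultdict(lambda: defaultdict(Counter))

    def record(self, org_id, vendor, account):
        """Store one past booking for this organization."""
        self._history[org_id][vendor][account] += 1

    def suggest(self, org_id, vendor):
        """Return the most frequent past account, or None without history."""
        counts = self._history.get(org_id, {}).get(vendor)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


suggester = AccountSuggester()
suggester.record("org-a", "Acme Power", "6340")
suggester.record("org-a", "Acme Power", "6340")
suggester.record("org-a", "Acme Power", "6900")
suggester.record("org-b", "Acme Power", "7100")

print(suggester.suggest("org-a", "Acme Power"))
print(suggester.suggest("org-b", "Acme Power"))
```

Because the history is keyed by organization, the same vendor can map to different accounts for different customers, which is the adaptive behavior described above.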

What users experience as a seamless interface is actually a federation of around 50 specialist models working together under the hood. Each model is tailored to a specific task or data structure and is dynamically selected according to the query type. This modularity ensures optimal performance without compromising the user experience.
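One common way to put many specialist models behind a single interface is a registry that maps a task or document type to a handler. The Python sketch below illustrates that pattern under stated assumptions: the document types, the function names, and the fallback to a generic extractor are all hypothetical, and the stub functions stand in for real models.

```python
from typing import Callable, Dict

Extractor = Callable[[str], dict]

# Hypothetical registry: document type -> specialist extractor
REGISTRY: Dict[str, Extractor] = {}


def register(doc_type: str):
    """Decorator that registers a specialist under a document type."""
    def decorator(fn: Extractor) -> Extractor:
        REGISTRY[doc_type] = fn
        return fn
    return decorator


@register("invoice")
def extract_invoice(text: str) -> dict:
    # Stub standing in for an invoice-specific model
    return {"model": "invoice", "chars": len(text)}


@register("receipt")
def extract_receipt(text: str) -> dict:
    # Stub standing in for a receipt-specific model
    return {"model": "receipt", "chars": len(text)}


def extract_generic(text: str) -> dict:
    # Fallback used when no specialist is registered (an assumption)
    return {"model": "generic", "chars": len(text)}


def route(doc_type: str, text: str) -> dict:
    """Select the specialist for this document type and run it."""
    model = REGISTRY.get(doc_type, extract_generic)
    return model(text)


print(route("invoice", "Invoice no. 1001"))
print(route("contract", "Free-form agreement text"))
```

The caller only ever sees `route`, so specialists can be added or replaced without changing the interface users interact with.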

Multilingual design

Photo: Claus Dahl, director of ML Assets at Visma

Language diversity has been built into the system from day one. The team has tested the model in about 20 languages, achieving strong results even in low-resource contexts. While English remains the most robust thanks to its rich training data, Dahl emphasizes that the system handles multilingual scenarios well, especially for simple information extraction tasks. “Even if the document is in Vietnamese, you can still find the date and amount,” he points out. “It becomes more complicated when you try to extract something like a reference to a project. That's where you need deeper context and language-specific understanding.”

To support this, the team relies on a multilingual foundation model. This allows users to interact with the system in their preferred language, whether they are uploading a document or querying its content.

Efficient training of high-impact models

One of the most compelling aspects of Visma's approach is its training strategy. Instead of chasing ever-larger datasets, the team prioritizes quality over quantity. Every year, it processes around 200 million documents from around 100 countries. These documents form the basis for a large general model that learns to understand a wide range of document formats and layouts. Once this foundation is in place, small, highly specialized models can be trained using just 50 carefully selected examples.

“High-quality data is more valuable than a large amount of data. We have invested in a dedicated team that curates these datasets to ensure accuracy, which means we can fine-tune the models very efficiently,” explains Dahl. This strategy reflects the scaling laws used in large language models, but tailors them to targeted enterprise applications. It allows the team to iterate quickly and achieve high performance in niche use cases without excessive computational costs.

Bridging automation and insight

As AI matures, document processing is no longer only about automation. Increasingly, it is about providing real-time insights. Visma's system now achieves error rates of 1 to 3%, close to human-level performance. This accuracy is maintained through a two-layer quality monitoring approach. One team tracks user performance metrics, while another team tracks documents in real time. Together, they provide the monitoring needed to ensure consistency across thousands of business cases and formats.
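As a rough illustration of how an error rate like this can be tracked, the Python sketch below compares predicted field values with the values users eventually confirmed and flags a batch that exceeds a threshold. The article does not say how Visma computes its metrics; the function names are hypothetical, and the 3% threshold is an assumption taken from the upper end of the range quoted above.

```python
def field_error_rate(predicted, confirmed):
    """Share of fields where the confirmed value differs from the prediction."""
    if len(predicted) != len(confirmed):
        raise ValueError("predicted and confirmed must have the same length")
    if not predicted:
        return 0.0
    errors = sum(p != c for p, c in zip(predicted, confirmed))
    return errors / len(predicted)


def needs_review(rate, threshold=0.03):
    """Flag a batch whose error rate exceeds the (assumed) 3% threshold."""
    return rate > threshold


# Hypothetical batch: 100 extracted totals, 2 of them corrected by users
predicted = ["100.00"] * 100
confirmed = ["100.00"] * 98 + ["101.00", "99.50"]

rate = field_error_rate(predicted, confirmed)
print(f"error rate: {rate:.1%}, needs review: {needs_review(rate)}")
```

In practice such a check would be run per field type, language, and customer segment, since an average can hide a format that performs much worse than the rest.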

“Accuracy is important, but so is adaptability,” Dahl points out. “All businesses are slightly different, so we focus on making the AI flexible enough to learn from its context.” The aim is to give customers software that can replace manual tasks and, beyond that, to help businesses extract meaning from their documents and make better-informed decisions faster.

Addressing language and formatting challenges

Despite the team's success, challenges remain. Documents in unfamiliar languages or highly localized formats can complicate extraction. Standard fields such as dates and amounts usually carry over well between languages, but subtler elements such as contract references and project numbers require a deeper semantic understanding. This is especially true when the format is inconsistent.

“There's a huge difference between reading a standard invoice and interpreting a free-form contract,” says Dahl. “That's why specialized training and contextual awareness become essential.” To overcome these challenges, the team relies on the ability of the transformer model to infer meaning from structure and context, rather than relying purely on keyword matching or templates.

Towards more autonomous AI systems

One new trend the Visma team is exploring is expanding the autonomy of AI systems. Most AI today works in short bursts, for example processing a single document or transaction, but the goal is to develop a system that can sustain coherent operation over longer periods. This mirrors the development seen in software agents, but comes with hurdles. Unlike the public code repositories used to train coding AIs, most business data is confidential.

“There aren't many companies that have all their accounts on the internet, so we have to find creative ways to train our models while respecting privacy,” Dahl points out. Still, the ambition remains: to create AI that can reason over time, draw insights across documents, and act as a true business partner.

Contrary to the fear that AI will replace workers, Dahl argues that the opposite is happening in business administration. There is a shortage of qualified professionals, and AI is helping to fill that gap. “I've heard of accountants letting clients go because they couldn't serve them properly,” he says. “AI allows these firms to handle more clients without compromising quality.”

Towards AI readiness

The conversation about AI in business has also evolved. Risk assessments and legal concerns dominated the early debate. Today, many professionals have experienced AI firsthand through translation tools, content generation apps, and even casual interactions. Companies now approach AI with practical expectations and assess it on performance, ease of integration, and return on investment. “We've moved past the hype,” Dahl says. “People are asking: Does it work for my business, and how quickly can I start using it?”

Visma's AI journey demonstrates how a focused, specialized approach to machine learning can deliver results at scale. By prioritizing efficiency, multilingual support, and legal compliance, the team has built a foundation for intelligent automation. As AI systems become more autonomous, contextual, and integrated, they are evolving into tools that help businesses make smarter decisions.
