The difficult truth of making generative AI work

AI For Business


The business world is in the midst of an artificial intelligence gold rush. The rapid spread of generative AI has brought a clear sense of urgency to the boardroom. Leaders are rushing to integrate large language models (LLMs) into their operations while still trying to understand exactly what these tools do and where they are heading.

The promise is alluring: ready-made intelligence, plugged in like a utility and ready to revolutionise productivity. However, as we at Sansan discovered on our own journey to apply AI, this promise often collides with the difficult reality of specific, high-stakes business problems.

Our experience of building a bespoke AI model from scratch reveals an important lesson for this era. The true, defensible value in enterprise AI comes from a painstaking, first-principles approach that bridges a deep understanding of business reality with deep technical expertise, rather than from blindly adopting the latest trends.

As a company built on high-precision information extraction, we focus on evolving our use of AI to speed up day-to-day operations and workflows across our solutions, including business cards and contacts (Sansan), invoices (Bill One), and contracts (Contract One, a solution offered in Japan).

The stakes are high. An error here is not a conversational quirk; it is a business liability. While general-purpose AI is a marvel of common-sense reasoning, we quickly discovered that it is a poor substitute for domain-specific expertise.

For example, these models could not reliably distinguish between personal and business email addresses on business cards without explicit, nuanced instructions that their architecture was not designed to handle.

More fundamentally, these models failed a critical technical test. Most are built on foundations such as OpenAI's CLIP, which are pre-trained on relatively low-resolution images.

This is fine for identifying a cat in a photo, but certainly not for deciphering small, dense print on a complex invoice.
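To make the resolution gap concrete, here is a rough back-of-the-envelope sketch. It is not a description of Sansan's system. It assumes an A4 invoice scanned at 300 DPI, 8-point print, and an encoder input whose long side is 224 pixels, a common input size for CLIP-style models. All of these figures, and the helper name `text_height_px`, are illustrative assumptions.

```python
# Illustrative arithmetic only; every figure here is an assumption,
# not a number taken from the article.

A4_HEIGHT_INCHES = 11.69   # long side of an A4 page
POINTS_PER_INCH = 72       # typographic points per inch


def text_height_px(font_pt: float, page_px: float) -> float:
    """Approximate pixel height of a line of text when the
    page's long side spans page_px pixels."""
    font_inches = font_pt / POINTS_PER_INCH
    return font_inches / A4_HEIGHT_INCHES * page_px


scan_px = A4_HEIGHT_INCHES * 300       # long side of a 300 DPI scan, in pixels
full = text_height_px(8, scan_px)      # 8 pt print at scan resolution
small = text_height_px(8, 224)         # same print after resizing to 224 px

print(f"300 DPI scan: {full:.1f} px per line of text")
print(f"224 px input: {small:.1f} px per line of text")
```

Under these assumptions, a line of 8-point print that spans roughly 33 pixels in the scan shrinks to roughly 2 pixels at the encoder's input, which is too few to tell individual characters apart.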

The very architecture of the available tools was misaligned with the fine-grained nature of our problem. What is “good enough” in the consumer world was nowhere near good enough for the needs of our business.

This left us at a strategic crossroads. We could continue to wrestle with inadequate tools, or take the considerable risk of building our own vision-language model from the ground up. We chose the latter. This was not a decision born of academic curiosity, but a calculated business strategy.

We determined that the market risk was low: we knew that a more accurate system would create immense value. The technical risk, however, was immense. We were venturing into uncharted territory with no guarantee of success.

Our journey was one of trial, error and adaptation. Our initial attempt to build on a typical model architecture proved inefficient for the high-resolution data our tasks required.

Training was unstable, the GPU memory requirements were enormous, and we were forced to use small batches of data, which crippled the learning process.
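One way to see why high-resolution inputs strain GPU memory is to count tokens. The sketch below is only an illustration: the article does not say which architecture was tried first. It assumes a Vision Transformer-style encoder that splits a square image into 14-pixel patches and uses naive self-attention, where each attention matrix holds one score per pair of tokens. The function names are hypothetical.

```python
# Illustrative scaling sketch; the patch size, resolutions and the
# choice of a ViT-style encoder are assumptions, not details from
# the article.

def num_patches(side_px: int, patch_px: int = 14) -> int:
    """Number of patch tokens for a square image cut into square patches."""
    per_side = side_px // patch_px
    return per_side * per_side


def attention_entries(tokens: int) -> int:
    """Entries in one naive self-attention score matrix (tokens x tokens)."""
    return tokens * tokens


for side in (224, 448, 896, 1792):
    t = num_patches(side)
    print(f"{side:>5} px -> {t:>6} tokens, "
          f"{attention_entries(t):>13,} attention entries")
```

Under these assumptions, each doubling of the image side quadruples the token count and multiplies the size of a naive attention matrix by sixteen. Memory that grows this quickly leaves little room on a GPU for anything but small batches.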

We had to pivot from the popular approach to a simpler, more elegant architecture suited to our particular goal of generating text from images. This willingness to question the hype and return to first principles ultimately enabled us to achieve a technical breakthrough.

But this is where most AI stories in the business press end.

In reality, our work had only just begun. We had built a model that matched the accuracy of existing optical character recognition (OCR) systems specialised for business cards. However, we soon ran into a second, more formidable obstacle: the wall of practical application.

As our engineers learned, technical success is only half the battle. Our new AI was powerful, but because it ran on expensive GPUs, its operating costs were significantly higher than those of our in-house systems. To justify its existence, it had to deliver a return on investment far above its cost.

Moreover, simply matching the performance of a deeply embedded legacy system is not a compelling reason to take on the cost and risk of replacing it. “Just as good” is a technical milestone, not a business case.

Our developers' first attempts to deploy the technology stalled. They approached the business units passively, asking, “What can we do for you?” This was met with a mixture of confusion and mismatched expectations. The breakthrough came only when we reversed our strategy.

Instead of asking what it could do, our tech team began to show it. They took a department's own data, ran it through the model, and presented a qualitative and quantitative analysis of the results in the form of concrete reports.

Suddenly, the conversation changed. Abstract possibilities became concrete value. By demonstrating the model's capabilities on a department's specific problems, rather than merely explaining them, the team built the trust and understanding needed to secure resources and move towards deployment.

This journey from a technical proof of concept to a growing, multi-domain business application holds a universal lesson. The real dividends of AI will not be claimed by the companies that are fastest to adopt generic tools.

They will go to those with the discipline to identify their core business challenges and the courage to invest in bespoke solutions, even if that means building from scratch. This requires a culture in which technical and business teams work closely together and speak a shared language.

The future of enterprise AI belongs not only to users, but also to builders. The last mile of innovation is always the hardest, and true competitive advantage will go to those who understand that it lies in the hard work of bridging the gap between the promise of powerful technology and the complex reality of a problem worth solving.


Fukuda River is the managing director of Sansan.

TNGlobal INSIDER publishes contributions relevant to entrepreneurship and innovation. You may submit your own original or published contributions, subject to editorial discretion.

Featured Image: Juan Ordonez from Unsplash
