The company accelerates AI development by providing new open source tools for large-scale data curation and model evaluation.
Iterative, a company dedicated to streamlining the workflow of artificial intelligence (AI) engineers and the creator of open source projects widely used in MLOps, has announced the release of DataChain, a new open source tool for processing and evaluating unstructured data.
According to McKinsey's global State of AI survey, published in early 2024, only 15% of surveyed companies have so far seen a meaningful impact of generative AI (GenAI) on their business. A large part of the problem lies in the difficulty of processing large amounts of unstructured data, which has traditionally been cumbersome, and of predicting the outcomes. This stems from the missing link between structured data technologies and new Python-based AI workflows. While older analytical databases allowed full control over data quality, unstructured multimodal data such as text and images has proved much harder to assess and improve at scale.
“Today, the biggest challenge to adopting artificial intelligence in the enterprise is the lack of data curation and generative AI evaluation practices and tools that can ensure the quality of the results,” said Dmitry Petrov, CEO of Iterative. “We need something that can evaluate and improve AI models. So far, this has only been done at the forefront of the industry: look at DeepMind's AlphaGo training against itself, or OpenAI's DALL-E 3 curating its own dataset. Our goal is to change this.”
The widespread availability of sophisticated AI models has paved the way for intelligent curation and processing of data. However, the technical barriers remain high because there is no easy way to process unstructured data with AI models and keep the results in a manageable format. In practice, most AI engineers still write custom code to convert JSON model responses, fit them into databases, and run models in parallel over data that does not fit in memory.
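To make that custom glue code concrete, here is a minimal sketch of the pattern in plain Python. It is not DataChain's API. The `Rating` schema, the `fake_model` stand-in, and the batch size are all hypothetical choices made for illustration.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from itertools import islice
from typing import Iterable, Iterator


@dataclass
class Rating:
    """Structured form of one model response (hypothetical schema)."""
    file: str
    label: str
    score: float


def parse_response(file: str, raw: str) -> Rating:
    """Turn a raw JSON model response into a typed record."""
    data = json.loads(raw)
    return Rating(file=file, label=str(data["label"]), score=float(data["score"]))


def fake_model(file: str) -> str:
    """Stand-in for a real model call (assumption); returns a JSON string."""
    label = "ok" if len(file) % 2 == 0 else "review"
    return json.dumps({"label": label, "score": len(file) / 10})


def batches(items: Iterable[str], size: int) -> Iterator[list[str]]:
    """Yield fixed-size batches so the full input never has to sit in memory."""
    it = iter(items)
    while batch := list(islice(it, size)):
        yield batch


def process(files: Iterable[str], batch_size: int = 2, workers: int = 4) -> Iterator[Rating]:
    """Run the model over each batch in parallel and yield structured rows in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for batch in batches(files, batch_size):
            raws = pool.map(fake_model, batch)  # map preserves input order
            for file, raw in zip(batch, raws):
                yield parse_response(file, raw)


if __name__ == "__main__":
    for row in process(["a.txt", "bb.txt", "c.jpg"]):
        print(row)
```

Even this toy version has to handle batching, parallelism, and schema conversion by hand, which is the overhead the paragraph above describes.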
DataChain democratizes popular AI-based analytics capabilities such as large language models (LLMs) judging other LLMs and multimodal GenAI evaluation, significantly leveling the playing field for data curation and pre-processing. DataChain can also store and structure Python object responses using the latest data model schemas used by leading LLM and AI foundation model providers.
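The "LLM judging LLMs" idea can be sketched in a few lines of plain Python, as below. This does not use DataChain's API. The `Verdict` schema and the rule-based `stub_judge` are hypothetical. In a real setup the judge would be a second model prompted to grade each answer.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Verdict:
    """Structured judgment of one answer (hypothetical schema)."""
    question: str
    answer: str
    passed: bool
    reason: str


def stub_judge(question: str, answer: str) -> Verdict:
    """Stand-in for a judge LLM (assumption): rejects empty or evasive answers."""
    passed = bool(answer.strip()) and "i don't know" not in answer.lower()
    reason = "non-empty, committed answer" if passed else "empty or evasive answer"
    return Verdict(question, answer, passed, reason)


def evaluate(
    pairs: Iterable[tuple[str, str]],
    judge: Callable[[str, str], Verdict] = stub_judge,
) -> list[Verdict]:
    """Ask the judge to grade every (question, answer) pair."""
    return [judge(q, a) for q, a in pairs]


def pass_rate(verdicts: list[Verdict]) -> float:
    """Share of answers the judge accepted; 0.0 for an empty list."""
    return sum(v.passed for v in verdicts) / len(verdicts) if verdicts else 0.0


if __name__ == "__main__":
    pairs = [("Capital of France?", "Paris"), ("2+2?", "I don't know")]
    verdicts = evaluate(pairs)
    print(verdicts, pass_rate(verdicts))
```

Keeping each verdict as a typed object rather than free text makes the judge's output easy to filter and aggregate during curation.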
Founded in 2018, Iterative builds developer tools for AI engineers. The company's open source software DVC has been downloaded over 20 million times and has over 18,000 stars on GitHub. Iterative currently has 400+ contributors across its various tools and 20+ enterprise SaaS customers, including Fortune 500 companies such as UBS. Iterative is backed by True Ventures, Afore Capital, and 468 Capital.
