Data Infrastructure: The Picks and Shovels of the AI Gold Rush

Much has been written about AI in the past year. Advances brought by generative AI models such as GPT-4 and Stable Diffusion have ushered in a new era of technological innovation and launched the “AI Gold Rush”. As a result, companies are competing for a piece of a huge pie that Goldman Sachs estimates at over $150 billion in annual revenue.

Investing in pickaxes and shovels during a gold rush

During the gold rush of the late 19th century, it was clear that gold was in the ground, but the success of any given mining operation was highly uncertain. One thing was certain, however: pickaxes and shovels were in demand.

Similarly, the AI gold rush leaves much uncertainty about what the AI value chain will look like in just a few years. Will startups be able to compete successfully with their own novel models, or will they all use fine-tuned flavors of models built by OpenAI and the leading cloud service providers (CSPs)? Or can current market leaders successfully incorporate AI into their products to stay on top?

Whatever the future holds for AI, one thing is certain: data infrastructure, the “picks and shovels” of the AI gold rush, will be essential to enabling this revolution.

Data infrastructure is key to enabling AI at scale

AI models form the foundation of this recent advancement, but scaling AI requires a robust data foundation on which to train and effectively serve models.

This process includes collecting and storing raw data, transforming data and training models using computational power, and processing and ingesting data in real time for inference. Ultimately, transforming raw data into AI insights in production is complex and relies on a strong data infrastructure. Data engineering teams play a key role in enabling AI, and they have to rely on a set of tools that is constantly being improved to handle rapidly growing data volumes, larger models, and the need for real-time processing and movement of data.

Data infrastructure has been transformed over the last decade, independently of AI, driven by the move to the cloud and a growing interest in analytics. This transformation has produced huge commercial successes, including Snowflake, Databricks, Confluent, Elastic, MongoDB, and more.

Today, thanks to the cloud, storage and computing limitations are greatly reduced. As a result, today’s major trends revolve around developing processes that make the data universe faster, more reliable, more efficient, and more impactful, all key ingredients for a successful AI deployment.

Key trends shaping data infrastructure

The following areas are expected to play a key role in shaping next-generation data infrastructure, and we highlight some of the promising Israeli startups in each.

Advances in data center hardware – As models and data grow in size and the amount of inference increases, faster computing power is required to keep the models viable from both a speed and cost standpoint. A new cohort of dedicated hardware accelerators for both ML and data transformation/query, including Neuroblade, NextSilicon and Speedata, are vying to challenge incumbent chip makers. Companies like Run:AI, on the other hand, are focused on virtualizing GPU clusters to make better use of existing resources.

In parallel, new software-defined storage architectures such as that of Vast Data (a Greenfield Partners portfolio company), and faster network and I/O technologies such as the silicon photonics developed by DustPhotonics (a Greenfield Partners portfolio company), are evolving to meet the need for faster data transfer.

Faster compute engines – As AI becomes more prevalent, we expect the convergence of cloud data warehouses and data lakes toward unified data lakehouse architectures to accelerate. This architecture provides the flexibility to support a wide range of use cases and compute engines while maintaining the required structure.

Snowflake, Databricks, and the CSPs are the market leaders in this space, but there remains an opportunity for new players with compute engines optimized for specific tasks to gain share, and we look forward to seeing new entrants take on the challenge.

Real-time data acquisition and processing – High-performance data acquisition and pipelines are required to enable low-latency model inference and feedback. Several new tools are available to accomplish this. These include new categories of databases optimized for ML, such as Pinecone’s vector database; fast caching enabled by Redis, DragonflyDB, and others; real-time datastores such as market leaders Pinot, Druid, and ClickHouse; and other approaches such as Materialize and Israel’s Epsio, along with streaming data pipelines and in-stream processing (such as the popular open source projects Kafka/Confluent and Flink). Companies like these enable rapid movement of data, a key focus for optimizing model performance.

Improved data governance, security and observability – As enterprise data infrastructure grows more complex, along with the number of sources feeding data into it and the number of users accessing it, data governance is becoming an increasingly important area of focus for data engineers. Observability, cataloging, privacy, and security are just a few of the growing areas that give enterprises greater control over and visibility into their data stack. This need has spawned promising Israeli companies that are leading the charge, including Monte Carlo, Illumex, Sentra, Cyera, and Dig Security.

The evolving MLOps landscape – MLOps can arguably be considered a separate category from data infrastructure, but it is well worth mentioning in this context. As the connecting thread between the underlying data platform, model development, and model deployment, MLOps tools, which streamline the model development and deployment process (just as DevOps does in traditional software development), are increasing in importance. Following the rapid evolution of AI models in recent months, we expect this category to undergo significant changes in the coming years, but we recognize its importance and have high expectations for several Israeli startups such as Qwak, Aporia, and Deci.

We are experiencing a historic technological change that will impact business and society more broadly. Given the phenomenal pace of innovation, we believe it’s too early to tell what AI will look like in the next few years, but a solid data foundation is key to enabling AI progress. It will be essential to the people building AI applications in the same way pickaxes and shovels were essential during the gold rush.

Itay Inbar is a Senior Associate at Greenfield Partners.


