A Billion Dollar Database for Generative AI | By Devansh – Machine Learning Made Simple | Jun 2023

How Vector Databases Enable Generative AI

Devansh - Machine Learning Made Simple
data driven investor

Join over 34,000 people keeping in touch with the most important ideas in machine learning through my free newsletter.

Here is a crazy statistic: vector database startups have raised $350 million in funding to enable the next generation of AI products.

This can be a head-scratcher. After all, Vector DBs are not as mainstream as other types of databases. Many people in the software industry have never even heard of them. So why is the whole field buzzing about them, to the point where investors put $100 million into just one startup (Pinecone, at a $750 million valuation)?

Today’s edition of Tech Made Simple covers why Vector DBs are taking over the headlines in the database space. We’ll explore what Vector DBs are, how they work, and why they work so well in synergy with AI.


Vector databases have many use cases across a variety of domains and applications, including natural language processing (NLP), computer vision (CV), recommendation systems (RS), and other areas requiring semantic understanding and matching of data.

-Microsoft

What is a Vector DB: Simply put, a Vector DB stores vectors (shocking). It may seem like a small thing at first, but it’s more powerful than you think. If you want a full definition, here’s an excerpt from Microsoft’s article on the topic: A vector database is a type of database that stores data as high-dimensional vectors, which are mathematical representations of features or attributes. … Vectors are usually produced by applying some transformation or embedding function to raw data such as text, images, audio, or video. Embedding functions can be based on various methods such as machine learning models, word embeddings, and feature extraction algorithms.
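To make "applying an embedding function to raw data" concrete, here is a minimal sketch. The function `toy_embed` and its `dim` parameter are hypothetical names made up for this illustration, and the hashed bag-of-words trick is a deliberately crude stand-in: real embedding functions are usually learned models that capture meaning, which this toy does not.

```python
import hashlib
import math

def toy_embed(text, dim=8):
    """Toy stand-in for an embedding function: hashed bag-of-words.

    Maps any string to a fixed-length vector of floats. This is NOT a
    learned embedding; it only illustrates the "raw data in, vector out" idea.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        # Use a stable hash so the same word always lands in the same slot.
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    # Scale to unit length so vectors of different-sized texts are comparable.
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec

v = toy_embed("vector databases store vectors")
print(len(v))  # 8
```

Whatever the input text, the output always has `dim` entries, which is what lets a database index and compare the vectors uniformly.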

To see why Vector DBs matter so much for generative AI, let’s first understand how Gen AI works. To greatly simplify a complex system, Gen AI models like ChatGPT work in three steps:

  1. The dataset is encoded into the latent space (these are called embeddings).
  2. The latent space is used to train the model.
  3. ChatGPT uses this encoded space to process the query. The query is fed into the latent space, and the AI traverses the latent space to find the optimal output.

This is where Vector DBs can prove very useful. The vectors stored in a Vector DB are not random values but latent space embeddings of the data (we are seeing more build-up here than in Game of Thrones). Vector DBs specialize in handling vectors, which makes them the KDB to the Haaland that is LLMs and other giant Gen AI models.

How Vector Databases work: Here is a Vector DB crash course for dummies.

  1. To use a Vector DB, you need vectors to insert. These vectors are generated by using an AI model to create vector embeddings of the data you want to index in the DB. The model used is called the Embedding Model (EM).
  2. The vector embeddings are inserted into the vector database. In general, you’ll want to store each embedding with a reference to the original content it was created from, so that embeddings stay distinguishable and search results can be mapped back to their source.
  3. When the application issues a query, it uses the same EM to create a query embedding and uses that embedding to query the database for similar vector embeddings.

When it comes to Gen AI like ChatGPT, we add another layer to this. The model uses these similarity calculations to work out the most likely next word. This, in essence, is also why ChatGPT hallucinates and ends up choosing words and sentences that may not actually be true. Note that this has nothing to do with old or inaccurate data in the dataset (as some claim). It is fundamentally a matter of architecture. If you want to learn more about why LLMs hallucinate, our sister publication, AI Made Simple, explains it in detail.
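The embed, insert, and query workflow from the numbered steps can be sketched in a few lines. Everything here is hypothetical and made up for illustration: `ToyVectorStore`, `toy_embed`, `cosine`, and the `doc-…` references are not the API of any real vector database, the hashed bag-of-words embedding is a crude stand-in for a real Embedding Model, and the brute-force scan replaces the approximate nearest-neighbour indexes that production systems typically use. Cosine similarity is also just one common choice of similarity measure, assumed here.

```python
import hashlib
import math

def toy_embed(text, dim=64):
    """Toy Embedding Model (EM): hashed bag-of-words, not a learned model."""
    vec = [0.0] * dim
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    return vec

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ToyVectorStore:
    """In-memory sketch of a vector store using brute-force similarity search."""

    def __init__(self, embed):
        self.embed = embed   # the same EM is used for inserts and queries
        self.rows = []       # list of (vector, reference to original content)

    def insert(self, content, ref):
        # Steps 1 and 2: embed the content and store it alongside its reference.
        self.rows.append((self.embed(content), ref))

    def query(self, text, k=1):
        # Step 3: embed the query with the same EM, then rank stored vectors.
        q = self.embed(text)
        scored = [(cosine(q, vec), ref) for vec, ref in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

store = ToyVectorStore(toy_embed)
store.insert("cats purr and sleep all day", ref="doc-1")
store.insert("stocks rose sharply on friday", ref="doc-2")
print(store.query("cats purr and sleep all day", k=1))
```

Returning the stored reference rather than the raw vector is the point of step 2: the application needs a way to get from a similar embedding back to the original content.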

Ultimately, Vector DBs add a level of flexibility not found in traditional databases. One of the things I learned quickly when I worked on converting English statements (written by business users) into the SQL queries that needed to be executed (which might join multiple tables) was that AI has its limits. Instead, I was able to build a prototype that worked reasonably well by using relatively basic AI (compared to the giant models we see these days) and focusing all our efforts on restructuring the dataset in a way that made it easier for the AI to interact with it. Vector DBs take that principle and, with millions of dollars raised, turn it up to 11.

That’s it for this piece. Thank you for your time. As always, if you’re interested in working with me or checking out my other work, my links are at the end of this email/post. If you like my writing, I’d appreciate it if you could send me an anonymous testimonial. You can drop it here. If you found this article valuable, please share it with more people. Word-of-mouth referrals like yours help me grow.

Save the time, energy, and money you would spend sifting through all those videos, courses, products, and “coaches”, and find everything you need in one place at “Tech Made Simple”! Stay ahead in AI, software engineering, and the tech industry with expert insights, tips, and resources. New subscribers get 20% off when they click this link. Subscribe now and simplify your approach to technology!

Using this discount will lower the prices:

800 INR (10 USD) → 640 INR (8 USD)/month

8000 INR (100 USD) → 6400 INR (80 USD)/year (533 INR/month)

20% off for 1 year

Use the links below to check out my other content, learn more about tutoring, contact me about a project, or just say hello.

Click here for short snippets on technology, AI and machine learning

AI Newsletter – https://artificialintelligencemadesimple.substack.com/

My grandmother’s favorite tech newsletter – https://codinginterviewsmadesimple.substack.com/

Check out my other articles on Medium: https://rb.gy/zn1aiu

My YouTube: https://rb.gy/88iwdd

Get in touch with me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y

My Instagram: https://rb.gy/gmvuy9

My Twitter: https://twitter.com/Machine01776819




