Typical Text AI Application Workflow | Harish Siva Subramanian | April 2024



Typical behind-the-scenes workflow for Text AI apps


This article describes what happens behind the scenes in a typical Text AI application that answers questions about an uploaded PDF. We all know about pdf.ai, where you can upload a PDF, ask a question, and get an answer. Have you ever wondered what's going on in the background?

Let's consider some of the options available to you.

Option 1

In this approach, the entire PDF text is sent to ChatGPT together with the user's question, and the model answers from that one giant prompt. This may not be a good idea: with that much context, the answers are often inaccurate.

ChatGPT also limits how much text you can send in a single prompt (the context window). Even when a larger context is available, the model doesn't work very well when a large amount of text is passed at once.
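A minimal sketch of this naive approach, stopping short of the actual API call. The `MAX_CONTEXT_TOKENS` value and the rough 4-characters-per-token estimate are assumptions for illustration, not exact figures for any particular model:

```python
# Option 1 sketch: stuff the entire PDF text into one prompt.
# MAX_CONTEXT_TOKENS and the ~4 chars/token ratio are rough assumptions.
MAX_CONTEXT_TOKENS = 4096

def build_naive_prompt(pdf_text: str, question: str) -> str:
    """Concatenate the whole document with the question."""
    return (
        "Answer the question using the document below.\n\n"
        f"Document:\n{pdf_text}\n\n"
        f"Question: {question}"
    )

def fits_in_context(prompt: str) -> bool:
    """Crude token estimate: roughly 4 characters per token."""
    return len(prompt) / 4 <= MAX_CONTEXT_TOKENS

prompt = build_naive_prompt("page 1 text ..." * 2000, "What is the refund policy?")
print(fits_in_context(prompt))  # → False: a long PDF blows past the limit
```

Even a modest PDF exceeds the window, which is exactly why Option 2 exists.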

Option 2

This option splits the uploaded PDF's text into smaller chunks. Each chunk is passed through an embedding algorithm, converted into an array of numbers, and stored in a vector database. When a user types a question, the system searches the vector store for the chunk containing the most relevant information.

It then sends this chunk and the user's question to ChatGPT to get an answer to the question.

Here are the steps in more detail.

First, the entire text is extracted from the PDF.

The text is then split into smaller chunks. You can specify whatever chunk size you think is appropriate, such as 1,000 characters.
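The splitting step can be sketched in a few lines. This is a simple character-based splitter with a small overlap (a common trick so that sentences cut at a chunk boundary still appear whole in at least one chunk); the function name and parameters are illustrative, not from any particular library:

```python
def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into fixed-size character chunks.

    Adjacent chunks share `overlap` characters so a sentence cut at a
    boundary still appears intact in one of them.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

# 2,500 characters with chunk_size=1000 and overlap=100 gives 3 chunks.
print(len(split_into_chunks("a" * 2500, chunk_size=1000, overlap=100)))  # → 3
```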

These chunks are passed to the embedding algorithm. An embedding takes a string and converts it into an array of floating-point numbers (for normalized embeddings, each value falls between -1 and 1).

Each number captures some aspect of the text's meaning, and you can create embeddings for text of almost any length, up to the model's input limit.

Different texts will always produce different embedding values, while the same text always produces the same ones.
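To make the interface concrete, here is a toy stand-in for an embedding model. Real embeddings come from a trained model (e.g., an embeddings API) and capture meaning; this hash-based version only mimics the shape of the output — deterministic, fixed-length, unit-normalized — and is entirely a teaching device:

```python
import hashlib
import math

def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Toy stand-in for a real embedding model.

    Deterministically maps text to a unit-length vector of floats.
    A real model would produce semantically meaningful values instead.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    vec = [digest[i] / 255.0 - 0.5 for i in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

print(toy_embed("hello") == toy_embed("hello"))  # → True: same text, same vector
print(toy_embed("hello") == toy_embed("world"))  # → False: different text, different vector
```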

Once we have the embeddings, we save them to a vector database.

When a user asks a question, the question itself is converted into an embedding using the same algorithm.

This embedding is compared against the list of embeddings in the vector store to find the most similar one.
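The comparison is typically done with cosine similarity. A minimal sketch, using a plain list of (chunk, embedding) pairs as a stand-in for the vector database — the chunk texts and vectors below are made up for illustration:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def most_similar(query_vec: list[float], store: list[tuple[str, list[float]]]) -> tuple[str, list[float]]:
    """Return the stored (chunk, embedding) pair closest to the query."""
    return max(store, key=lambda item: cosine_similarity(query_vec, item[1]))

# Toy vector store: three chunks with hand-picked 3-dimensional embeddings.
store = [
    ("Chunk 1: shipping times", [0.9, 0.1, 0.0]),
    ("Chunk 2: payment methods", [0.1, 0.9, 0.0]),
    ("Chunk 3: refund policy", [0.0, 0.2, 0.9]),
]
query = [0.05, 0.15, 0.95]  # pretend embedding of "What is the refund policy?"
best_chunk, _ = most_similar(query, store)
print(best_chunk)  # → Chunk 3: refund policy
```

A real vector database does the same comparison, just with approximate-nearest-neighbor indexes so it stays fast over millions of chunks.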

In this case, suppose the most relevant text is in chunk #3.

It then passes the question along with the most relevant chunk to ChatGPT, and ChatGPT generates the answer.
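The final prompt sent to the model can be sketched as follows; the wording of the template is an assumption, but the key point is that only the retrieved chunk goes in, not the whole PDF:

```python
def build_rag_prompt(chunk: str, question: str) -> str:
    """Combine only the most relevant chunk with the question —
    far less text than sending the entire PDF as in Option 1."""
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{chunk}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt("Refunds are issued within 14 days of purchase...",
                          "What is the refund policy?")
print(prompt)
```

This string would then be sent as the user message in a chat-completion request.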

Each of the tasks listed above can be accomplished easily with LangChain, which provides methods under the hood for every one of these steps.

I'll save that for a future article.

Thank you for reading!


