Use the following context to answer the user's question.
If the question cannot be answered from the context, state that clearly.
Context:
{context}
Question:
{question}
Then I created a new service, SpringAIRagService:
package com.infoworld.springaidemo.service;

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.chat.prompt.PromptTemplate;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Service;

@Service
public class SpringAIRagService {
    // The RAG prompt template, loaded from the classpath
    @Value("classpath:/templates/rag-template.st")
    private Resource promptTemplate;

    private final ChatClient chatClient;
    private final VectorStore vectorStore;

    public SpringAIRagService(ChatClient.Builder chatClientBuilder, VectorStore vectorStore) {
        this.chatClient = chatClientBuilder.build();
        this.vectorStore = vectorStore;
    }

    public String query(String question) {
        // Retrieve the two documents most similar to the question
        SearchRequest searchRequest = SearchRequest.builder()
                .query(question)
                .topK(2)
                .build();
        List<Document> similarDocuments = vectorStore.similaritySearch(searchRequest);

        // Join the text of the retrieved documents into a single context string
        String context = similarDocuments.stream()
                .map(Document::getText)
                .collect(Collectors.joining("\n"));

        // Fill in the template and send the resulting prompt to the LLM
        Prompt prompt = new PromptTemplate(promptTemplate)
                .create(Map.of("context", context, "question", question));
        return chatClient.prompt(prompt)
                .call()
                .content();
    }
}
The SpringAIRagService wires in a ChatClient.Builder, which it uses to build a ChatClient, along with a VectorStore. The query() method accepts a question and uses the VectorStore to build the context. First, we create a SearchRequest, which we do as follows:
- Call the static builder() method.
- Pass the question as the query.
- Use the topK() method to specify how many documents we want to retrieve from the vector store.
- Call the build() method.
In this case, we want to retrieve the two documents that are most similar to the question. In practice, I would use a larger value, such as the top three or top five, but since I only have three documents, I limited it to two.
Next, we call the vector store's similaritySearch() method, passing it the SearchRequest. The similaritySearch() method uses the vector store's embedding model to create a multidimensional vector of the question. It then compares that vector to each document and returns the documents that are most similar to the question. We stream over all of the similar documents, retrieve their text, and build a context String.
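To make the comparison step concrete, here is a minimal plain-Java sketch of a top-K similarity search followed by the same newline join that query() performs. It is not how Spring AI or any particular vector store is implemented: the Doc record, the cosine(), topK(), and buildContext() helpers, and the three-dimensional embeddings are all made up for illustration, and cosine similarity is assumed as the metric (a real store may use a different metric and an index instead of a full scan).

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Toy stand-in for a vector store. Doc, cosine(), topK(), and buildContext()
// are hypothetical names for this sketch, not Spring AI APIs.
class SimilaritySearchSketch {

    // A document with a precomputed (made-up) embedding.
    record Doc(String text, double[] embedding) {}

    // Cosine similarity: dot(a, b) / (|a| * |b|). Values near 1.0 mean "very similar".
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Score every document against the query vector and keep the k best.
    static List<Doc> topK(List<Doc> docs, double[] query, int k) {
        return docs.stream()
                .sorted(Comparator.comparingDouble((Doc d) -> cosine(d.embedding(), query)).reversed())
                .limit(k)
                .collect(Collectors.toList());
    }

    // Join the text of the retrieved documents with newlines, as query() does.
    static String buildContext(List<Doc> docs) {
        return docs.stream().map(Doc::text).collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("Spring AI supports RAG.", new double[] {0.9, 0.1, 0.0}),
                new Doc("Spring Data abstracts databases.", new double[] {0.1, 0.9, 0.0}),
                new Doc("Vector stores hold embeddings.", new double[] {0.7, 0.2, 0.1}));
        double[] question = {1.0, 0.0, 0.0}; // made-up embedding of the question
        System.out.println(buildContext(topK(docs, question, 2)));
    }
}
```

With these toy vectors, the first and third documents score highest, so they are the two that end up in the context.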
Next, we create a prompt that tells the LLM to use the context to answer the question. Note that it is important to instruct the LLM to answer from the context, and to say so when the question cannot be answered from the context. Without these instructions, the LLM will fall back on the data it was trained on to answer the question, which means it will use information that is not in the context we provided.
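To see what the LLM actually receives, the following plain-Java sketch fills the {context} and {question} placeholders of the template by hand. In the service, Spring AI's PromptTemplate does this work; the render() helper here is a hypothetical, deliberately naive stand-in written only for illustration.

```java
import java.util.Map;

// Hypothetical helper that mimics {placeholder} substitution in plain Java.
// render() is not part of Spring AI; PromptTemplate handles this in the service.
class RagPromptSketch {

    // The same text as rag-template.st shown earlier.
    static final String TEMPLATE = String.join("\n",
            "Use the following context to answer the user's question.",
            "If the question cannot be answered from the context, state that clearly.",
            "Context:",
            "{context}",
            "Question:",
            "{question}");

    // Naive substitution: replaces each {key} with its value.
    static String render(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> entry : values.entrySet()) {
            result = result.replace("{" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(render(TEMPLATE, Map.of(
                "context", "Spring AI supports RAG.",
                "question", "Does Spring AI support RAG?")));
    }
}
```

The rendered text keeps both instruction lines at the top, so the model sees the "answer only from the context" rule before it sees the retrieved documents or the question.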
Finally, we build the prompt, setting its context and question, and send it to the LLM through the ChatClient. I then added a SpringAIRagController to handle POST requests and pass them to the SpringAIRagService:
package com.infoworld.springaidemo.web;

import com.infoworld.springaidemo.model.SpringAIQuestionRequest;
import com.infoworld.springaidemo.model.SpringAIQuestionResponse;
import com.infoworld.springaidemo.service.SpringAIRagService;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class SpringAIRagController {
    private final SpringAIRagService springAIRagService;

    public SpringAIRagController(SpringAIRagService springAIRagService) {
        this.springAIRagService = springAIRagService;
    }

    // Accepts a JSON body with a "question" field and returns the LLM's answer
    @PostMapping("/springAIQuestion")
    public ResponseEntity<SpringAIQuestionResponse> askAIQuestion(@RequestBody SpringAIQuestionRequest questionRequest) {
        String answer = springAIRagService.query(questionRequest.question());
        return ResponseEntity.ok(new SpringAIQuestionResponse(answer));
    }
}
The askAIQuestion() method accepts a SpringAIQuestionRequest, which is a Java record:
package com.infoworld.springaidemo.model;
public record SpringAIQuestionRequest(String question) {
}
The controller responds with a SpringAIQuestionResponse:
package com.infoworld.springaidemo.model;
public record SpringAIQuestionResponse(String answer) {
}
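If records are new to you, the short sketch below shows why the controller calls questionRequest.question() rather than a getter: a record generates an accessor named after each component, along with equals(), hashCode(), and toString(). The two records are redeclared as nested types inside a hypothetical RecordSketch class only so that the example compiles on its own.

```java
// Stand-alone illustration of record accessors; RecordSketch is a hypothetical
// wrapper class, and the nested records mirror the ones in the article.
class RecordSketch {

    record SpringAIQuestionRequest(String question) {}

    record SpringAIQuestionResponse(String answer) {}

    public static void main(String[] args) {
        var request = new SpringAIQuestionRequest("Does Spring AI support RAG?");
        // The accessor is question(), not getQuestion().
        System.out.println(request.question());

        // Records compare by value, which is convenient in tests.
        var first = new SpringAIQuestionResponse("Yes.");
        var second = new SpringAIQuestionResponse("Yes.");
        System.out.println(first.equals(second));
    }
}
```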
Now restart the application and POST to /springAIQuestion. In my case I sent the following request body:
{
"question": "Does Spring AI support RAG?"
}
And I received the following response:
{
"answer": "Yes. Spring AI explicitly supports Retrieval Augmented Generation (RAG), including chat memory, integrations with major vector stores, a portable vector store API with metadata filtering, and a document injection ETL framework to build RAG pipelines."
}
As you can see, the LLM answered the question using the context of the documents loaded into the vector store. We can further test whether it follows our instructions by asking a question that the context does not cover:
{
"question": "Who created Java?"
}
The LLM response is:
{
"answer": "The provided context does not include information about who created Java."
}
This is an important validation: it confirms that the LLM is using only the provided context to answer the question, rather than relying on its training data or, worse, fabricating an answer.
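If you would rather send these requests from code than from a REST tool, the following sketch uses the JDK's built-in HttpClient. It assumes the application is running locally on port 8080 (Spring Boot's default); the AskQuestionClient class and its toJson() and buildRequest() helpers are hypothetical names for this example, and the hand-rolled JSON escaping only handles quotes and backslashes, so a real client would use a JSON library instead.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical command-line client for the /springAIQuestion endpoint.
class AskQuestionClient {

    // Minimal JSON escaping (quotes and backslashes only); use a JSON library in real code.
    static String toJson(String question) {
        String escaped = question.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"question\": \"" + escaped + "\"}";
    }

    // Build the POST request without sending it, so it can be inspected or tested.
    static HttpRequest buildRequest(String baseUrl, String question) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/springAIQuestion"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(toJson(question)))
                .build();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the sample application is listening on localhost:8080.
        HttpRequest request = buildRequest("http://localhost:8080", "Who created Java?");
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Keeping request construction separate from sending makes it easy to reuse the same helper for both the in-context and out-of-context questions.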
Conclusion
This article showed you how to use Spring AI to incorporate large language model capabilities into Spring-based applications. You can configure LLMs and other AI technologies using Spring's standard application.yaml file and wire them into your Spring components. Spring AI provides an abstraction for interacting with LLMs, so you don’t need to use LLM-specific SDKs. For experienced Spring developers, this entire process is similar to how Spring Data abstracts database interactions behind Spring Data interfaces.
This example showed how to configure and use a large language model in a Spring MVC application. We configured OpenAI to answer simple questions, introduced prompt templates to externalize LLM prompts, and concluded by implementing a simple RAG service in a sample application using a vector store.
Spring AI has a powerful feature set, and we’ve just scratched the surface of what you can do with it. We hope the examples in this article provide enough basic knowledge to help you start building AI applications using Spring. Once you’re comfortable configuring and accessing large language models in your applications, you can move on to more advanced AI programming, such as building AI agents to improve business processes.
