
Spring AI Structured Output API

Photo by Shubham Dhage on Unsplash

LLMs generate natural language responses that are easily understood by humans. However, it is difficult for applications to communicate with each other using natural language. Therefore, these responses must be converted into machine-readable formats such as JSON, CSV, XML, or POJOs.

Some LLM providers, including OpenAI, Google (PaLM), and Anthropic (Claude), offer models that can understand an output schema and generate responses that conform to it.

We'll learn how Spring AI helps integrate with them through its framework.

Prerequisites

For this article, we'll use OpenAI's LLM services. Hence, we'll need an active OpenAI subscription and an API key to invoke its APIs.

Furthermore, we must import the Maven dependencies to integrate the Spring Boot application with OpenAI:

<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
    <version>1.0.0-SNAPSHOT</version>
</dependency>

Moreover, it's always advisable to use Spring Initializr to add the necessary dependencies. The details are also available in our repository.
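Before running the examples, the application also needs the OpenAI credentials. A minimal application.properties sketch (the API key placeholder and model choice below are illustrative; use your own values):

```properties
# Read the key from an environment variable rather than hard-coding it
spring.ai.openai.api-key=${OPENAI_API_KEY}
# Model name is an example; pick whichever chat model your subscription supports
spring.ai.openai.chat.options.model=gpt-4o
```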

Generate Output Format Using Prompt

The easiest way to obtain a response in a specific format is to prompt the LLM service to generate it in that format, such as JSON, XML, or YAML.

Consider a use case where a user queries the LLM about books by a certain author and expects the LLM to respond with the books' details in JSON format.

For this, let's write a basic program with the Spring AI framework:

void whenPromptToProvideJsonResponse_thenCheckResponse() {
    String query = "Can you suggest some popular works by the author Mark Twain? "
      + "Please provide the response in the form of a JSON array. "
      + "Respond only with a valid JSON array. "
      + "Do not include any additional text, comments, or symbols. "
      + "Each object in the JSON array has the following keys: "
      + "bookName, AuthorName, yearOfPublish.";
    Prompt prompt = new Prompt(query);

    ChatResponse chatResponse = chatModel.call(prompt);
    logger.info("The output from the LLM: {}", chatResponse.getResult()
      .getOutput()
      .getContent());
}

The Spring AI framework injects the ChatModel bean. Then, we create a Prompt object with precise instructions to generate the JSON response. Finally, we send the prompt to the OpenAI LLM service by calling the ChatModel#call() method. However, there are a few inherent issues with this approach:

  • It isn't a standard approach.
  • The prompts are overly descriptive.
  • To send the JSON to a downstream service, we first need to convert it into a POJO explicitly.
  • There's no way to prevent the LLM from hallucinating and generating an invalid response.
  • It doesn't use the LLM providers' specialized APIs that support Structured Output.
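To see how fragile the prompt-only approach is, consider the manual cleanup it can force on us: despite the "respond only with valid JSON" instruction, models sometimes wrap the answer in Markdown code fences, which we'd have to strip by hand before any parsing. A minimal plain-Java sketch (class and method names are illustrative, not part of Spring AI):

```java
public class LlmJsonCleanup {

    // Strips an optional ```json ... ``` fence that LLMs often add around
    // "JSON-only" answers, leaving the bare JSON for a downstream parser.
    static String stripFences(String raw) {
        String s = raw.trim();
        if (s.startsWith("```")) {
            s = s.substring(s.indexOf('\n') + 1);      // drop the ```json line
            int fence = s.lastIndexOf("```");
            if (fence >= 0) {
                s = s.substring(0, fence);             // drop the closing fence
            }
        }
        return s.trim();
    }

    public static void main(String[] args) {
        String raw = "```json\n[{\"bookName\":\"The Gilded Age\"}]\n```";
        System.out.println(stripFences(raw)); // [{"bookName":"The Gilded Age"}]
    }
}
```

This kind of ad-hoc scrubbing is exactly what the structured output converters in the next section let us avoid.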

We'll address these issues in the next section, but first, let's see how the LLM responds:

[
    {
        "bookName": "The Adventures of Tom Sawyer",
        "AuthorName": "Mark Twain",
        "yearOfPublish": 1876
    },
    {
        "bookName": "Adventures of Huckleberry Finn",
        "AuthorName": "Mark Twain",
        "yearOfPublish": 1884
    },
    {
        "bookName": "The Prince and the Pauper",
        "AuthorName": "Mark Twain",
        "yearOfPublish": 1881
    },
    and so on...
]

We get a JSON array in response, including all the attributes defined in the prompt. The LLM essentially responded in JSON mode. However, we need a way to make the LLM adhere strictly to a JSON schema. That's possible only if the LLM supports Structured Outputs.

Structured Output Converter

Spring AI provides multiple implementations of the StructuredOutputConverter&lt;T&gt; interface. Before moving on, let's see how applications use a structured output converter to communicate with each other:

Structured Output Converter


Bean Output Converter With Record Type

The BeanOutputConverter helps convert the JSON message received from the LLM directly into a POJO:

void whenUseBeanOutputConverter_thenConvertToPojo() {
    StructuredOutputConverter<AuthorBooks> beanOutputConverter
      = new BeanOutputConverter<>(AuthorBooks.class);
    String userInputTemplate = """
      Can you suggest 6 popular works of the Author {author}.
      Please provide the title, author name and publishing date.
      {format}
      """;
    PromptTemplate promptTemplate
      = new PromptTemplate(userInputTemplate,
        Map.of("author","Mark Twain", "format", beanOutputConverter.getFormat()));

    Prompt prompt = new Prompt(promptTemplate.createMessage());

    ChatResponse chatResponse = chatModel.call(prompt);
    Generation generation = chatResponse.getResult();
    String responseContent = generation.getOutput().getContent();
    logger.info("The output from the LLM: {}", responseContent);
    AuthorBooks authorBooks = beanOutputConverter.convert(responseContent);
    assertTrue(authorBooks.books().size() == 6);
}

The method creates an instance of BeanOutputConverter for a response of the AuthorBooks record type:

record AuthorBooks(String author, List<Book> books) {}
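The article doesn't show the Book record that AuthorBooks references, so here's a plausible definition; the field names are assumptions inferred from the JSON response shown further below (title, author, yearOfPublish):

```java
// Hypothetical Book record matching the JSON attributes in the LLM response;
// field names are inferred from the sample output, not taken from the repo.
record Book(String title, String author, String yearOfPublish) {}
```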

Next, we create the PromptTemplate object from the userInputTemplate string and a Map object. The Map supplies values for the author and format placeholders in the template; the format value is the AuthorBooks bean's JSON schema, returned by BeanOutputConverter#getFormat().

Later, we create the Prompt object using the PromptTemplate and finally call the ChatModel#call(Prompt) method to receive the response from the LLM:

{
  "author": "Mark Twain",
  "books": [
    {
      "author": "Mark Twain",
      "title": "The Adventures of Tom Sawyer",
      "yearOfPublish": "1876"
    },
    {
      "author": "Mark Twain",
      "title": "The Adventures of Huckleberry Finn",
      "yearOfPublish": "1884"
    },
    {
      "author": "Mark Twain",
      "title": "A Connecticut Yankee in King Arthur's Court",
      "yearOfPublish": "1889"
    },
and so on....

The BeanOutputConverter object converts the JSON response into an AuthorBooks object. Consequently, it's much easier to pass this object to downstream services for further processing.

Bean Output Converter With ParameterizedTypeReference

In the previous section, we created an additional wrapper record, AuthorBooks. We can avoid it entirely by using the ParameterizedTypeReference class:

void whenUseBeanOutputConverterWithParameterizedType_thenConvertToPojo() {
    StructuredOutputConverter<List<Book>> beanOutputConverter
      = new BeanOutputConverter<>(new ParameterizedTypeReference<List<Book>>() {
    });
    String userInputTemplate = """
      Can you suggest 6 popular works of the Author {author}.
      Please provide the title, author name and publishing date.
      {format}
      """;

    PromptTemplate promptTemplate
      = new PromptTemplate(userInputTemplate,
        Map.of("author","Mark Twain", "format", beanOutputConverter.getFormat()));

    Prompt prompt = new Prompt(promptTemplate.createMessage());

    ChatResponse chatResponse = chatModel.call(prompt);
    Generation generation = chatResponse.getResult();
    String responseContent = generation.getOutput().getContent();
    logger.info("The output from the LLM: {}", responseContent);
    List<Book> books = beanOutputConverter.convert(responseContent);
    assertTrue(books.size() == 6);
}

In this example, we created a ParameterizedTypeReference object of type List&lt;Book&gt; and used it to instantiate the BeanOutputConverter object. The rest of the steps are similar to those in the previous section.

Map Output Converter

The MapOutputConverter class converts the LLM response to an object of Map<String, Object>. Also, the MapOutputConverter#getFormat() method helps specify the format of the LLM response.

Let's consider an example, where we'll fetch a map of the author's name to a list of books written by them:

void whenUseMapOutputConverter_thenConvertToPojo() {
    StructuredOutputConverter<Map<String, Object>> mapOutputConverter 
      = new MapOutputConverter();
    String userInputTemplate = """
      Can you suggest 2 popular books each, of any two authors from India.
      The books have to be mentioned against the author name as the key.
      Please provide the title, author name and publishing date.
      {format}
      """;
    PromptTemplate promptTemplate
      = new PromptTemplate(userInputTemplate,
      Map.of("format", mapOutputConverter.getFormat()));

    Prompt prompt = new Prompt(promptTemplate.createMessage());
    ChatResponse chatResponse = chatModel.call(prompt);
    Generation generation = chatResponse.getResult();

    String responseContent = generation.getOutput().getContent();
    logger.info("The output from the LLM: {}", responseContent);
    Map<String, Object> map = mapOutputConverter.convert(responseContent);
    
    assertTrue(map.keySet().size() == 2);
}

When the method executes, it displays the JSON response from the LLM:

{
  "Chetan Bhagat": [
    {
      "title": "Five Point Someone",
      "author": "Chetan Bhagat",
      "publishing_date": "2004"
    },
    {
      "title": "One Night @ the Call Center",
      "author": "Chetan Bhagat",
      "publishing_date": "2005"
    }
  ],
  "Arundhati Roy": [
    {
      "title": "The God of Small Things",
      "author": "Arundhati Roy",
      "publishing_date": "1997"
    },
    {
      "title": "The Ministry of Utmost Happiness",
      "author": "Arundhati Roy",
      "publishing_date": "2017"
    }
  ]
}

Finally, we call MapOutputConverter#convert(responseContent) to get the Map&lt;String, Object&gt; object.
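Because the values in the converted Map are untyped, a downstream consumer has to cast them back to the nested structure seen in the JSON above. A hedged sketch of such a consumer (the class, method, and sample data are illustrative):

```java
import java.util.List;
import java.util.Map;

public class MapResponseHandler {

    // Counts the total books across all authors; each value in the map is a
    // List of book entries, mirroring the nested JSON structure above.
    @SuppressWarnings("unchecked")
    static int countBooks(Map<String, Object> byAuthor) {
        return byAuthor.values().stream()
                .mapToInt(v -> ((List<Object>) v).size())
                .sum();
    }

    public static void main(String[] args) {
        Map<String, Object> byAuthor = Map.of(
            "Chetan Bhagat", List.of(Map.of("title", "Five Point Someone")),
            "Arundhati Roy", List.of(Map.of("title", "The God of Small Things")));
        System.out.println(countBooks(byAuthor)); // 2
    }
}
```

If the shape of the payload is known up front, BeanOutputConverter with a proper record is usually the safer choice than these unchecked casts.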

List Output Converter

Suppose we just want a list of book titles. Let's see how to do that:

void whenUseListOutputConverter_thenConvertToPojo() {
    DefaultConversionService defaultConversionService = new DefaultConversionService();
    ListOutputConverter listOutputConverter
      = new ListOutputConverter(defaultConversionService);
    String userInputTemplate = """
        Can you suggest 6 popular works of the Author {author}.
        Please provide only the title.
        {format}
        """;

    PromptTemplate promptTemplate
      = new PromptTemplate(userInputTemplate,
        Map.of("author","Mark Twain", "format", listOutputConverter.getFormat()));

    Prompt prompt = new Prompt(promptTemplate.createMessage());

    ChatResponse chatResponse = chatModel.call(prompt);
    Generation generation = chatResponse.getResult();
    String responseContent = generation.getOutput().getContent();
    logger.info("The output from the LLM: {}", responseContent);
    List<String> books = listOutputConverter.convert(responseContent);
    assertTrue(books.size() == 6);
}

When the method executes, then the LLM responds with a comma-delimited list of book titles:

The Adventures of Tom Sawyer, Adventures of Huckleberry Finn, 
A Connecticut Yankee in King Arthur's Court, The Prince and the Pauper,
The Innocents Abroad, Life on the Mississippi

We create the ListOutputConverter object with the DefaultConversionService class. Then, we use it to convert the comma-delimited output into a List object.
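Under the hood, the conversion amounts to splitting the comma-delimited reply and trimming each entry. A standalone plain-Java sketch of that behavior (the real converter delegates to Spring's DefaultConversionService; this version is only for illustration):

```java
import java.util.Arrays;
import java.util.List;

public class CsvToList {

    // Splits a comma-delimited LLM reply into a trimmed list of values,
    // mimicking what ListOutputConverter produces.
    static List<String> toList(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .toList();
    }

    public static void main(String[] args) {
        String response = "The Adventures of Tom Sawyer, Adventures of Huckleberry Finn";
        System.out.println(toList(response).size()); // 2
    }
}
```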

Conclusion

As more systems adopt LLMs, receiving responses in pre-defined formats has become essential. These formats enable seamless communication between systems, fostering reliable and innovative solutions.

As a result, Spring AI's support for Structured Output is emerging as a popular feature.

Visit our GitHub repository to access the article's source code.
