LLMs generate natural language responses that are easily understood by humans. However, it is difficult for applications to communicate with each other using natural language. Therefore, these responses must be converted into machine-readable formats such as JSON, CSV, XML, or POJOs.
Some LLM providers, including OpenAI, Google PaLM, and Anthropic (Claude), offer specialized models that can understand an output schema and generate the response accordingly.
We'll learn how Spring AI helps integrate with them through its framework.
Prerequisites
For this article, we'll use OpenAI LLM services. Hence, we'll need an active OpenAI subscription to invoke its APIs.
Furthermore, we must import the Maven dependencies to integrate the Spring Boot application with OpenAI:
<dependency>
<groupId>org.springframework.ai</groupId>
<artifactId>spring-ai-openai-spring-boot-starter</artifactId>
<version>1.0.0-SNAPSHOT</version>
</dependency>
Moreover, it's always advisable to use Spring Initializr to add the necessary dependencies. The details are also available in our repository.
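In addition to the dependency, the OpenAI starter needs the API key configured in the application properties. A minimal sketch, assuming the key is supplied through an environment variable (the model name shown is an assumption; any chat model the subscription supports will work):

```properties
# Read the API key from the OPENAI_API_KEY environment variable
spring.ai.openai.api-key=${OPENAI_API_KEY}
# Optional: pick a specific chat model (assumed value for illustration)
spring.ai.openai.chat.options.model=gpt-4o
```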
Generate Output Format Using Prompt
The easiest way to obtain a response in a specific format is to prompt the LLM service to generate it in that format, such as JSON, XML, or YAML.
Consider a use case where a user queries the LLM about books by a certain author and expects the LLM to respond with the book details in JSON format.
For this, let's write a basic program with the Spring AI framework:
void whenPromptToProvideJsonResponse_thenCheckResponse() {
String query = "Can you suggest some popular works by the Author Mark Twain. "
+ "Please provide the response in the form of JSON Array. "
+ "Respond only with a valid JSON Array. "
+ "Do not include any additional text, comments or symbols. "
+ "Each object in the JSON array has the following keys: "
+ "bookName, AuthorName, yearOfPublish.";
Prompt prompt = new Prompt(query);
ChatResponse chatResponse = chatModel.call(prompt);
logger.info("The output from the LLM: {}", chatResponse.getResult()
.getOutput()
.getContent());
}
The Spring AI framework injects the ChatModel bean. Then, we create the Prompt object with the precise instructions to generate the JSON response. Later, we send the prompt to the OpenAI LLM service by calling the ChatModel#call() method. Inherently, there are a few issues with this approach:
- It is not a standard approach
- Prompts are overly descriptive
- To send the JSON to a downstream service, we first need to convert it to a POJO explicitly
- There is no way to prevent the LLM from hallucinating and generating a wrong response
- It doesn't use the LLM providers' specialized APIs that support Structured Output
We'll address these issues in the next section, but first, let's see how the LLM responds:
[
{
"bookName": "The Adventures of Tom Sawyer",
"AuthorName": "Mark Twain",
"yearOfPublish": 1876
},
{
"bookName": "Adventures of Huckleberry Finn",
"AuthorName": "Mark Twain",
"yearOfPublish": 1884
},
{
"bookName": "The Prince and the Pauper",
"AuthorName": "Mark Twain",
"yearOfPublish": 1881
},
and so on...
]
We get a JSON array in response, including all the attributes defined in the prompt. The LLM essentially responded in JSON mode. However, we need a way to make the LLM adhere strictly to a JSON schema. That's possible if the LLM supports Structured Outputs.
Structured Output Converter
Spring AI provides multiple implementations of the StructuredOutputConverter<T> interface. Before we move on, let's see how an application can utilize a structured output converter to communicate with other applications:
Bean Output Converter With Record Type
The BeanOutputConverter helps convert the JSON messages received from LLM directly into a POJO:
void whenUseBeanOutputConverter_thenConvertToPojo() {
StructuredOutputConverter<AuthorBooks> beanOutputConverter
= new BeanOutputConverter<>(AuthorBooks.class);
String userInputTemplate = """
Can you suggest 6 popular works of the Author {author}.
Please provide the title, author name and publishing date.
{format}
""";
PromptTemplate promptTemplate
= new PromptTemplate(userInputTemplate,
Map.of("author","Mark Twain", "format", beanOutputConverter.getFormat()));
Prompt prompt = new Prompt(promptTemplate.createMessage());
ChatResponse chatResponse = chatModel.call(prompt);
Generation generation = chatResponse.getResult();
String responseContent = generation.getOutput().getContent();
logger.info("The output from the LLM: {}", responseContent);
AuthorBooks authorBooks = beanOutputConverter.convert(responseContent);
assertTrue(authorBooks.books().size() == 6);
}
The method creates an instance of BeanOutputConverter for a response of the AuthorBooks record type:
record AuthorBooks(String author, List<Book> books) {}
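The Book record referenced by AuthorBooks isn't shown here; its actual definition lives in the article's repository. A plausible sketch, assuming fields that mirror the attributes requested in the prompt:

```java
// Hypothetical definition of Book; the real one is in the article's repository.
// Field names mirror the title, author name, and publishing date asked for in the prompt.
record Book(String title, String author, String yearOfPublish) {}
```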
Further, we create the PromptTemplate object from the userInputTemplate string and a Map that supplies values for the author and format placeholders in the template. The format value is the AuthorBooks bean's JSON schema, returned by BeanOutputConverter#getFormat().
Later, we create the Prompt object using the PromptTemplate and finally call the ChatModel#call(Prompt) method to receive the response from the LLM:
{
"author": "Mark Twain",
"books": [
{
"author": "Mark Twain",
"title": "The Adventures of Tom Sawyer",
"yearOfPublish": "1876"
},
{
"author": "Mark Twain",
"title": "The Adventures of Huckleberry Finn",
"yearOfPublish": "1884"
},
{
"author": "Mark Twain",
"title": "A Connecticut Yankee in King Arthur's Court",
"yearOfPublish": "1889"
},
    and so on...
  ]
}
The BeanOutputConverter object converts the JSON response to the AuthorBooks object. Consequently, it is much easier to pass this object to downstream services for further processing.
Bean Output Converter With ParameterizedTypeReference
In the previous section, we created an additional wrapper record AuthorBooks. We can avoid it entirely by using the ParameterizedTypeReference class:
void whenUseBeanOutputConverterWithParameterizedType_thenConvertToPojo() {
StructuredOutputConverter<List<Book>> beanOutputConverter
= new BeanOutputConverter<>(new ParameterizedTypeReference<List<Book>>() {
});
String userInputTemplate = """
Can you suggest 6 popular works of the Author {author}.
Please provide the title, author name and publishing date.
{format}
""";
PromptTemplate promptTemplate
= new PromptTemplate(userInputTemplate,
Map.of("author","Mark Twain", "format", beanOutputConverter.getFormat()));
Prompt prompt = new Prompt(promptTemplate.createMessage());
ChatResponse chatResponse = chatModel.call(prompt);
Generation generation = chatResponse.getResult();
String responseContent = generation.getOutput().getContent();
logger.info("The output from the LLM: {}", responseContent);
List<Book> books = beanOutputConverter.convert(responseContent);
assertTrue(books.size() == 6);
}
In this example, we created a ParameterizedTypeReference object of type List<Book> and used it to instantiate the BeanOutputConverter object. The rest of the steps are similar to those discussed in the previous section.
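ParameterizedTypeReference works because an anonymous subclass preserves its generic type argument at runtime, information that plain type erasure would otherwise discard. Here is a minimal sketch of that "super type token" pattern, a simplified stand-in rather than Spring's actual implementation:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Simplified stand-in for ParameterizedTypeReference: the anonymous
// subclass created at the call site records its type argument in the
// generic superclass signature, so we can read it back reflectively.
abstract class TypeToken<T> {
    private final Type type;

    protected TypeToken() {
        // getGenericSuperclass() returns e.g. TypeToken<List<Book>>, not a raw TypeToken
        ParameterizedType superType = (ParameterizedType) getClass().getGenericSuperclass();
        this.type = superType.getActualTypeArguments()[0];
    }

    public Type getType() {
        return type;
    }
}
```

For instance, new TypeToken<List<String>>() {}.getType() reports java.util.List<java.lang.String>, which is exactly the kind of type information BeanOutputConverter needs to build a JSON schema for a list.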
Map Output Converter
The MapOutputConverter class converts the LLM response into a Map<String, Object> object. Additionally, the MapOutputConverter#getFormat() method helps specify the format of the LLM response.
Let's consider an example where we fetch a map of author names to the list of books written by each:
void whenUseMapOutputConverter_thenConvertToPojo() {
StructuredOutputConverter<Map<String, Object>> mapOutputConverter
= new MapOutputConverter();
String userInputTemplate = """
Can you suggest 2 popular books each, of any two authors from India.
The books have to be mentioned against the author name as the key
Please provide the title, author name and publishing date.
{format}
""";
PromptTemplate promptTemplate
= new PromptTemplate(userInputTemplate,
Map.of("format", mapOutputConverter.getFormat()));
Prompt prompt = new Prompt(promptTemplate.createMessage());
ChatResponse chatResponse = chatModel.call(prompt);
Generation generation = chatResponse.getResult();
String responseContent = generation.getOutput().getContent();
logger.info("The output from the LLM: {}", responseContent);
Map<String, Object> map = mapOutputConverter.convert(responseContent);
assertTrue(map.keySet().size() == 2);
}
When the method executes, it displays the JSON response from the LLM:
{
"Chetan Bhagat": [
{
"title": "Five Point Someone",
"author": "Chetan Bhagat",
"publishing_date": "2004"
},
{
"title": "One Night @ the Call Center",
"author": "Chetan Bhagat",
"publishing_date": "2005"
}
],
"Arundhati Roy": [
{
"title": "The God of Small Things",
"author": "Arundhati Roy",
"publishing_date": "1997"
},
{
"title": "The Ministry of Utmost Happiness",
"author": "Arundhati Roy",
"publishing_date": "2017"
}
]
}
Later, we call MapOutputConverter#convert(responseContent) to fetch the Map<String, Object> object.
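Once converted, downstream code can traverse the nested structure with plain collection APIs. A short sketch, using a hand-built map with the same shape as the response above (the helper method and its name are illustrative, not part of Spring AI):

```java
import java.util.List;
import java.util.Map;

class MapResponseDemo {
    // Extracts the first book title for the given author from a converted
    // LLM response shaped as: author name -> list of book maps
    static String firstTitle(Map<String, Object> response, String author) {
        List<?> books = (List<?>) response.get(author);
        return (String) ((Map<?, ?>) books.get(0)).get("title");
    }
}
```

Since MapOutputConverter yields untyped values, the caller is responsible for casting them back to the expected collection types, as shown above.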
List Output Converter
Suppose we just want the list of book titles. Let's see how to do that:
void whenUseListOutputConverter_thenConvertToPojo() {
DefaultConversionService defaultConversionService = new DefaultConversionService();
ListOutputConverter listOutputConverter
= new ListOutputConverter(defaultConversionService);
String userInputTemplate = """
Can you suggest 6 popular works of the Author {author}.
Please provide only the title.
{format}
""";
PromptTemplate promptTemplate
= new PromptTemplate(userInputTemplate,
Map.of("author","Mark Twain", "format", listOutputConverter.getFormat()));
Prompt prompt = new Prompt(promptTemplate.createMessage());
ChatResponse chatResponse = chatModel.call(prompt);
Generation generation = chatResponse.getResult();
String responseContent = generation.getOutput().getContent();
logger.info("The output from the LLM: {}", responseContent);
List<String> books = listOutputConverter.convert(responseContent);
assertTrue(books.size() == 6);
}
When the method executes, then the LLM responds with a comma-delimited list of book titles:
The Adventures of Tom Sawyer, Adventures of Huckleberry Finn,
A Connecticut Yankee in King Arthur's Court, The Prince and the Pauper,
The Innocents Abroad, Life on the Mississippi
We create the ListOutputConverter object with the DefaultConversionService class. Later, we use it to convert the comma-delimited output into a List object.
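Conceptually, the conversion boils down to splitting the response on commas and trimming each item. The actual ListOutputConverter delegates this to Spring's conversion service; the following is a simplified sketch of the same idea:

```java
import java.util.Arrays;
import java.util.List;

class CommaListDemo {
    // Splits a comma-delimited LLM response into trimmed items,
    // mimicking the list that ListOutputConverter produces
    static List<String> toList(String response) {
        return Arrays.stream(response.split(","))
                .map(String::trim)
                .toList();
    }
}
```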
Conclusion
As more systems adopt LLMs, receiving responses in pre-defined formats has become essential. These formats enable seamless communication between systems, fostering reliable and innovative solutions.
As a result, Spring AI's support for Structured Output is emerging as a popular feature.
Visit our GitHub repository to access the article's source code.