Posts

Spring AI's Transcription API

Some large language models (LLMs) can transcribe audio into text. Businesses are rapidly adopting this technology and reaping productivity benefits. We've seen glimpses of it in Zoom, Microsoft Teams, and other collaboration and communication tools, where call transcriptions can be generated automatically. The entertainment industry is also adopting it rapidly in movies, advertisements, and other fields. Spring AI aims to provide a unified transcription API that integrates with LLM providers like OpenAI and Azure OpenAI. This article will explore how the Transcription API can be used to transcribe an audio file using OpenAI. Transcription API Key Classes We'll start by learning some important Spring AI Transcription API classes. We can divide the components into two groups, one specific to the underlying LLM service provider and the other...
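As a taste of what the full post covers, here is a minimal sketch of transcribing a file with Spring AI's OpenAI support. It assumes the OpenAI starter is on the classpath and an API key is configured (for example via `spring.ai.openai.api-key`); the class names follow recent Spring AI releases and may differ in older milestones, and `TranscriptionService` is a hypothetical name for illustration.

```java
import org.springframework.ai.audio.transcription.AudioTranscriptionPrompt;
import org.springframework.ai.audio.transcription.AudioTranscriptionResponse;
import org.springframework.ai.openai.OpenAiAudioTranscriptionModel;
import org.springframework.core.io.FileSystemResource;
import org.springframework.stereotype.Service;

// Hypothetical service wrapping Spring AI's auto-configured transcription model.
@Service
public class TranscriptionService {

    private final OpenAiAudioTranscriptionModel transcriptionModel;

    public TranscriptionService(OpenAiAudioTranscriptionModel transcriptionModel) {
        this.transcriptionModel = transcriptionModel;
    }

    public String transcribe(String audioFilePath) {
        // Wrap the audio file in a prompt and call the model; the response
        // output is the plain-text transcription.
        AudioTranscriptionPrompt prompt =
                new AudioTranscriptionPrompt(new FileSystemResource(audioFilePath));
        AudioTranscriptionResponse response = transcriptionModel.call(prompt);
        return response.getResult().getOutput();
    }
}
```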

Spring AI Image Model API

Large Language Models (LLMs) such as GPT-4, Claude 2, and Llama 2 generate text outputs when invoked with text prompts. However, image models such as DALL-E, Midjourney, and Stable Diffusion can generate images from user prompts, which can be either text or images. All model providers have their own APIs, and switching between them becomes challenging without a common interface. Luckily, the Spring AI library offers a framework that helps integrate seamlessly with the underlying models. In this article, we'll learn some important components of Spring AI's Image Model API and implement a demo program to generate an image from a prompt. Important Components of Image Model API First, let's look at the important components of the Image Model API that help integrate with underlying providers such as OpenAI and Stability AI: interfaces such as ImageOptions and ImageModel, as well as classes su...
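The provider-agnostic shape of this API can be sketched in a few lines. This assumes an `ImageModel` bean auto-configured by a provider starter (such as OpenAI's) and a configured API key; whether the result arrives as a URL or base64 data depends on the image options chosen.

```java
import org.springframework.ai.image.ImageModel;
import org.springframework.ai.image.ImagePrompt;
import org.springframework.ai.image.ImageResponse;
import org.springframework.stereotype.Service;

// Hypothetical service; the ImageModel interface hides which provider
// (OpenAI, Stability AI, ...) actually generates the image.
@Service
public class ImageGenerationService {

    private final ImageModel imageModel;

    public ImageGenerationService(ImageModel imageModel) {
        this.imageModel = imageModel;
    }

    public String generate(String description) {
        // Call the model with a text prompt and return the generated image's URL.
        ImageResponse response = imageModel.call(new ImagePrompt(description));
        return response.getResult().getOutput().getUrl();
    }
}
```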

Spring AI Advisors API

Most enterprise applications have to deal with cross-cutting concerns like governance, compliance, auditing, and security. The interceptor pattern addresses these challenges in most development frameworks. Spring AI Advisors bring this pattern to AI applications that interact with Large Language Models (LLMs). These advisors act as interceptors for LLM requests and responses: they can intercept outgoing requests to LLM services, modify them, and then forward them to the LLM. Similarly, they can intercept incoming LLM responses, process them, and then deliver them to the application. Advisors are useful for numerous use cases: sanitizing and validating data before sending it to LLMs, monitoring and controlling LLM usage costs, and auditing and logging LLM requests and responses to adhere to governance and compliance requirements as part of the respo...
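As a small preview of the interception idea, Spring AI ships a logging advisor that can be attached to a ChatClient. The sketch below assumes a `ChatModel` bean from a provider starter; note that the advisor interfaces themselves have been renamed across Spring AI versions, so custom-advisor signatures may differ from what the full post shows.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.chat.model.ChatModel;

public class AdvisorExample {

    public static String ask(ChatModel chatModel, String question) {
        // SimpleLoggerAdvisor intercepts each request/response pair and logs it,
        // without the calling code changing at all.
        ChatClient chatClient = ChatClient.builder(chatModel)
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();

        return chatClient.prompt()
                .user(question)
                .call()
                .content();
    }
}
```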

Spring AI Structured Output API

LLMs generate natural language responses that are easily understood by humans. However, it is difficult for applications to communicate with each other using natural language. Therefore, these responses must be converted into machine-readable formats such as JSON, CSV, XML, or POJOs. Some LLM providers, including OpenAI, Google PaLM, and Anthropic (Claude), have specialized models that can understand an output schema and generate the response accordingly. We'll learn how Spring AI helps integrate with them through its framework. Prerequisites For this article, we'll use OpenAI LLM services. Hence, we'll need an active OpenAI subscription to invoke its APIs. Furthermore, we must import the Maven dependencies to integrate the Spring Boot application with OpenAI: <dependency> <groupId>org.springframework.ai</groupId> <artifactId>spring-ai-openai-spring-boot-starter</artif...
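The core of the Structured Output API is mapping a model's reply straight onto a Java type. A minimal sketch, assuming a configured `ChatModel` bean and an OpenAI key; `MovieRecommendation` is a hypothetical record introduced here for illustration.

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;

public class StructuredOutputExample {

    // Hypothetical target type; Spring AI derives a JSON schema from it,
    // appends format instructions to the prompt, and parses the reply.
    public record MovieRecommendation(String title, String genre, int year) {}

    public static MovieRecommendation recommend(ChatModel chatModel) {
        return ChatClient.create(chatModel)
                .prompt()
                .user("Recommend one science-fiction movie")
                .call()
                .entity(MovieRecommendation.class); // converts the LLM text into the record
    }
}
```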

Building AI Assistance Using Spring AI's Function Calling API

Building AI assistance into existing legacy applications is gaining a lot of momentum. An AI assistant like a chatbot can provide users with a unified experience and enable them to perform functionalities across multiple modules through a single interface. In this article, we'll see how to leverage Spring AI to build an AI assistant. We'll demonstrate how to seamlessly reuse existing application services and functions alongside LLM capabilities. Function Calling Concept An LLM can respond to an application request in multiple ways: it can respond from its training data, it can use the information provided in the prompt to answer the query, or the prompt can include callback function metadata that helps the LLM obtain the response. Let's try to understand the third option, Spring AI's Function Calling ...
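The third option can be sketched as registering an existing application function as a Spring bean that the LLM may ask the framework to invoke. Everything here (`OrderTools`, `orderStatus`, the record types) is hypothetical, and the registration API has changed across Spring AI versions (newer releases favor a tools-based API), so treat this as a sketch rather than the definitive wiring.

```java
import java.util.function.Function;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Description;

@Configuration
public class OrderTools {

    public record OrderRequest(String orderId) {}
    public record OrderStatus(String orderId, String status) {}

    // Spring AI sends the function's name, description, and input schema to the
    // model; when the model decides it needs order data, the framework invokes
    // this function and feeds the result back into the conversation.
    @Bean
    @Description("Get the current status of a customer order")
    public Function<OrderRequest, OrderStatus> orderStatus() {
        return req -> new OrderStatus(req.orderId(), "SHIPPED"); // stand-in for a real lookup
    }
}
```

At call time the function is enabled by bean name on the prompt, e.g. `chatClient.prompt().user("Where is my order 42?").functions("orderStatus").call().content()` in milestone releases.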

Implement RAG with Spring AI and Qdrant DB

Earlier, we discussed Spring AI's integration with Qdrant DB. Continuing along the same lines, we'll explore and implement the Retrieval Augmented Generation (RAG) technique using Spring AI and Qdrant DB. We'll develop a chatbot that helps users query PDF documents in natural language. RAG Technique Several LLMs exist, including OpenAI's GPT and Meta's Llama series, all pre-trained on publicly available internet data. However, they can't be used directly in a private enterprise's context because of access restrictions to its knowledge base. Moreover, fine-tuning LLMs is a time-consuming and resource-intensive process. Hence, augmenting the query or prompt with information from the private knowledge base is the quickest and easiest way out. The application converts the user query into vectors. Then, it fires the q...
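The retrieve-then-generate flow described above can be sketched against Spring AI's `VectorStore` abstraction (Qdrant being one implementation). This assumes configured `VectorStore` and `ChatModel` beans; note that the search-request builder and the document accessor (`getContent()` vs. `getText()`) have been renamed between Spring AI versions, so adjust to your release.

```java
import java.util.List;
import java.util.stream.Collectors;

import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.SearchRequest;
import org.springframework.ai.vectorstore.VectorStore;

public class RagExample {

    public static String answer(VectorStore vectorStore, ChatModel chatModel, String question) {
        // 1. The vector store embeds the query and returns the closest documents
        //    from the private knowledge base (top-3 here).
        List<Document> docs = vectorStore.similaritySearch(
                SearchRequest.query(question).withTopK(3)); // builder naming varies by version

        // 2. Stuff the retrieved text into the prompt as grounding context.
        String context = docs.stream()
                .map(Document::getContent) // getText() in newer releases
                .collect(Collectors.joining("\n"));

        // 3. Ask the LLM to answer from that context only.
        return ChatClient.create(chatModel)
                .prompt()
                .system("Answer using only the following context:\n" + context)
                .user(question)
                .call()
                .content();
    }
}
```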