How to Build a Resilient Multi‑LLM Chatbot with Spring AI
This tutorial demonstrates how to integrate multiple large language models from different providers into a Spring Boot application using Spring AI, configure primary, secondary, and tertiary models, and implement a fallback mechanism with Spring Retry to ensure high availability of the chatbot.
1. Overview
Modern applications increasingly embed large language models (LLMs) to add intelligent features. While a single LLM can handle many tasks, relying on only one model is not always optimal because different models excel at different kinds of work. This article shows how to use Spring AI to integrate multiple LLMs into a Spring Boot application, configure models from different vendors as well as multiple models from the same vendor, and build a resilient chatbot that automatically switches models when failures occur.
2. Configuring LLMs from Different Vendors
We first configure two LLMs, one from OpenAI and one from Anthropic.
2.1 Configure the Primary LLM
Add the OpenAI starter dependency to pom.xml:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-openai</artifactId>
    <version>1.0.2</version>
</dependency>

Then configure the API key and model in application.yaml:
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        options:
          model: ${PRIMARY_LLM}
          temperature: 1

Spring AI automatically creates an OpenAiChatModel bean, which we expose as a ChatClient bean marked @Primary:
@Configuration
class ChatbotConfiguration {

    @Bean
    @Primary
    ChatClient primaryChatClient(OpenAiChatModel chatModel) {
        return ChatClient.create(chatModel);
    }
}

2.2 Configure the Secondary LLM
Add the Anthropic starter dependency to pom.xml:
<dependency>
    <groupId>org.springframework.ai</groupId>
    <artifactId>spring-ai-starter-model-anthropic</artifactId>
    <version>1.0.2</version>
</dependency>

Configure its API key and model in application.yaml:
spring:
  ai:
    anthropic:
      api-key: ${ANTHROPIC_API_KEY}
      chat:
        options:
          model: ${SECONDARY_LLM}

Create a dedicated ChatClient bean for the secondary model:
@Bean
ChatClient secondaryChatClient(AnthropicChatModel chatModel) {
    return ChatClient.create(chatModel);
}

3. Configuring Multiple LLMs from the Same Vendor
Spring AI creates only one ChatModel bean per vendor, so we must define additional beans manually. We add a custom property spring.ai.anthropic.chat.options.tertiary-model to hold the third model name and then define a custom ChatModel and its corresponding ChatClient:
@Bean
ChatModel tertiaryChatModel(AnthropicApi anthropicApi,
        AnthropicChatModel anthropicChatModel,
        @Value("${spring.ai.anthropic.chat.options.tertiary-model}") String tertiaryModelName) {
    AnthropicChatOptions chatOptions = anthropicChatModel.getDefaultOptions().copy();
    chatOptions.setModel(tertiaryModelName);
    return AnthropicChatModel.builder()
        .anthropicApi(anthropicApi)
        .defaultOptions(chatOptions)
        .build();
}

@Bean
ChatClient tertiaryChatClient(@Qualifier("tertiaryChatModel") ChatModel tertiaryChatModel) {
    return ChatClient.create(tertiaryChatModel);
}

4. Practical Use‑Case: A Resilient Chatbot
4.1 Build the Chatbot Service
We create a ChatbotService that injects the three ChatClient beans. The primary chat method is annotated with @Retryable to retry up to three times. If all retries fail, a @Recover method falls back to the secondary client, and if that also fails, it uses the tertiary client:
@Retryable(retryFor = Exception.class, maxAttempts = 3)
String chat(String prompt) {
    logger.debug("Attempting to process prompt '{}' with primary LLM. Attempt #{}", prompt,
        RetrySynchronizationManager.getContext().getRetryCount() + 1);
    return primaryChatClient.prompt(prompt).call().content();
}

@Recover
String chat(Exception exception, String prompt) {
    logger.warn("Primary LLM failure: {}", exception.getMessage());
    try {
        return secondaryChatClient.prompt(prompt).call().content();
    } catch (Exception e) {
        logger.warn("Secondary LLM failure: {}", e.getMessage());
        return tertiaryChatClient.prompt(prompt).call().content();
    }
}

4.2 Expose a REST API
We add a controller that receives a POST request at /api/chatbot/chat, forwards the prompt to the service, and returns the response:
@PostMapping("/api/chatbot/chat")
ChatResponse chat(@RequestBody ChatRequest request) {
    String response = chatbotService.chat(request.prompt());
    return new ChatResponse(response);
}

record ChatRequest(String prompt) {}
record ChatResponse(String response) {}

4.3 Test the Fallback Logic
Run the application with invalid model names for the primary and secondary LLMs and a valid model for the tertiary LLM:
OPENAI_API_KEY=... \
ANTHROPIC_API_KEY=... \
PRIMARY_LLM=gpt-100 \
SECONDARY_LLM=claude-opus-200 \
TERTIARY_LLM=claude-3-haiku-20240307 \
mvn spring-boot:run

Send a request with HTTPie:

http POST :8080/api/chatbot/chat prompt="What is the capital of France?"

The response shows the correct answer, confirming that the system fell back to the third LLM after the first two failed. Log output demonstrates the retry attempts and the cascade from primary to secondary to tertiary model.
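Readers without HTTPie can issue the same request with the JDK's built-in HttpClient. This is a sketch: the endpoint and JSON shape match the controller from section 4.2, and it assumes the application is running on localhost:8080.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Issues the same POST request as the HTTPie command, using only the JDK.
public class ChatbotRequestDemo {

    // Builds the POST request for the chatbot endpoint from section 4.2.
    static HttpRequest buildChatRequest(String prompt) {
        String json = "{\"prompt\": \"" + prompt + "\"}"; // naive escaping; fine for a demo
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/api/chatbot/chat"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildChatRequest("What is the capital of France?");
        System.out.println(request.method() + " " + request.uri());

        // Only send when the app is actually running (pass any argument to enable).
        if (args.length > 0) {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.body());
        }
    }
}
```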
5. Conclusion
This article explored how to integrate multiple LLMs into a single Spring AI application. We first showed how Spring AI abstracts model configuration for different vendors such as OpenAI and Anthropic. Then we addressed the more complex scenario of configuring several models from the same vendor by defining custom beans. Finally, we built a high‑availability chatbot that uses Spring Retry to automatically switch between models when failures occur, ensuring continuous service.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"