Generate Structured JSON with Ollama LLM Using Java

This guide explains why structured JSON output from LLMs is essential, walks through installing and running Ollama, and provides a complete Java Spring Boot implementation—including POJOs, service code, and best‑practice tips—to retrieve AI‑generated data in a reliable, parsable format.

Java Architecture Diary

Why Choose Structured Output?

In real‑world applications, integrating LLM responses into existing systems often requires a standard data format such as JSON, which is easy to parse, clearly structured, and simple to integrate.

Easy to parse: JSON is a standard data‑exchange format that maps directly to objects in most programming languages.

Clear structure: Compared with plain text, JSON's hierarchy is more explicit.

Integration‑friendly: It can be readily consumed by existing workflows and services.
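To make the "easy to parse" point concrete, here is a minimal sketch using Jackson (bundled with Spring Boot): a JSON string maps straight onto a plain Java type. The `Car` record and the sample payload are hypothetical, purely for illustration.

```java
import com.fasterxml.jackson.databind.ObjectMapper;

// Minimal illustration: Jackson maps a JSON string directly onto a Java record.
public class JsonMappingDemo {
    // Hypothetical target type for the example payload below
    public record Car(String model, String brand) {}

    public static void main(String[] args) throws Exception {
        String json = "{\"model\":\"M9\",\"brand\":\"AITO\"}";
        Car car = new ObjectMapper().readValue(json, Car.class);
        System.out.println(car.model() + " / " + car.brand()); // prints "M9 / AITO"
    }
}
```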

Technical Implementation

1. Environment Setup

First install and run Ollama locally.

macOS Installation

# Use Homebrew to install
brew install ollama

Linux Installation

# Use the official script
curl -fsSL https://ollama.com/install.sh | sh

Start Ollama Service

After installation, run:

# Start Ollama service
ollama serve

Pull and Run Model

Open a new terminal and pull the Qwen2.5 32B model:

# Pull Qwen2.5 32B model
ollama pull qwen2.5:32b

Test the model:

# Test model
ollama run qwen2.5:32b "Hello, please introduce yourself"

Verify API Service

Ensure the Ollama API is running:

# Test API service
curl http://localhost:11434/api/version

If version information is returned, the service is ready.

2. Java Code Implementation

We use Spring Boot (with Lombok for boilerplate) to create a simple example. First, define the required POJOs:

@Data
@Builder
public class ChatMessage {
    private String role;
    private String content;
}

@Data
@Builder
public class ChatCompletionRequest {
    private String model;
    private List<ChatMessage> messages;
    private String format;
    // Disable streaming so the full reply arrives as a single JSON body
    private Boolean stream;
}

@Data
@JsonIgnoreProperties(ignoreUnknown = true)
public class ChatCompletionResponse {
    private String model;
    // Ollama returns snake_case field names such as created_at
    @JsonProperty("created_at")
    private String createdAt;
    private ChatMessage message;
    private boolean done;
}

Next, implement the service that calls the Ollama API:

@Service
@Slf4j
public class ChatCompletionService {
    private static final String API_ENDPOINT = "http://localhost:11434/api/chat";
    private final RestTemplate restTemplate;

    public ChatCompletionService(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    public String generateStructuredResponse(String prompt) {
        ChatCompletionRequest request = ChatCompletionRequest.builder()
            .model("qwen2.5:32b")
            .messages(List.of(ChatMessage.builder()
                .role("user")
                .content(prompt)
                .build()))
            .format("json")
            .stream(false)
            .build();

        ResponseEntity<ChatCompletionResponse> response = restTemplate.postForEntity(
            API_ENDPOINT,
            request,
            ChatCompletionResponse.class
        );

        return Optional.ofNullable(response.getBody())
            .map(ChatCompletionResponse::getMessage)
            .map(ChatMessage::getContent)
            .orElse("");
    }
}
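The service above injects a RestTemplate, but Spring Boot does not register one automatically. A minimal configuration sketch (class name RestClientConfig is our own choice) could look like this:

```java
import org.springframework.boot.web.client.RestTemplateBuilder;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.client.RestTemplate;

// Provide the RestTemplate bean required by ChatCompletionService's constructor.
@Configuration
public class RestClientConfig {
    @Bean
    public RestTemplate restTemplate(RestTemplateBuilder builder) {
        // Timeouts and interceptors can be configured on the builder if needed
        return builder.build();
    }
}
```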

Practical Example

Suppose we want a JSON recommendation for the AITO M9 car model:

String prompt = """
    Please generate recommendation info for the 问界M9 (AITO M9), returned as JSON with this structure:
    {
        "model": string,
        "brand": string,
        "priceRange": string,
        "powerType": string,
        "scenarios": string[],
        "advantages": string[],
        "recommendation": {
            "trim": string,
            "color": string,
            "options": string[]
        }
    }
    """;

String response = chatCompletionService.generateStructuredResponse(prompt);

Ollama returns JSON similar to:

{
    "model": "问界M9",
    "brand": "问界AITO",
    "priceRange": "500,000–700,000 CNY",
    "powerType": "extended-range hybrid",
    "scenarios": ["business reception", "family outings", "long-distance travel", "urban commuting"],
    "advantages": ["Huawei smart cockpit", "spacious interior", "luxurious comfort", "intelligent driving", "low fuel consumption"],
    "recommendation": {
        "trim": "Flagship Edition",
        "color": "Interstellar Silver",
        "options": ["automatic parking assist", "executive seat package", "panoramic glass roof"]
    }
}
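Because the content field of the reply is itself a JSON string, it can be deserialized into typed objects with Jackson. The CarRecommendation and Pick records below are hypothetical types of ours, shaped to match the structure requested in the prompt:

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

// Deserialize the model's JSON reply into typed records matching the
// structure requested in the prompt.
public class RecommendationParser {
    public record Pick(String trim, String color, List<String> options) {}
    public record CarRecommendation(String model, String brand, String priceRange,
                                    String powerType, List<String> scenarios,
                                    List<String> advantages, Pick recommendation) {}

    public static CarRecommendation parse(String content) throws Exception {
        return new ObjectMapper().readValue(content, CarRecommendation.class);
    }
}
```

From here the data flows through the rest of the application as ordinary Java objects rather than raw text.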

Best Practices

Clear prompts: Explicitly specify the desired JSON structure and set format to json.

Error handling: Add appropriate error handling because LLM output may not always meet expectations.

Output validation: Use JSON Schema to verify the response format.

Performance optimization: Consider caching to avoid duplicate requests.
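As a lightweight stand-in for full JSON Schema validation, the sketch below checks that the model's reply parses as JSON and carries a few required keys before downstream code touches it. The class name and the REQUIRED list are our own assumptions; a dedicated JSON Schema library can replace this in production:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.List;

// Guard downstream code: confirm the reply is valid JSON and has required keys.
public class ResponseValidator {
    private static final List<String> REQUIRED = List.of("model", "brand", "recommendation");

    public static boolean isValid(String content) {
        try {
            JsonNode root = new ObjectMapper().readTree(content);
            return REQUIRED.stream().allMatch(root::hasNonNull);
        } catch (Exception e) {
            return false; // not even parsable JSON
        }
    }
}
```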

Conclusion

Ollama’s structured output capability provides a powerful way to embed AI functionality into existing systems. By using JSON, developers can easily process and consume AI‑generated content. The Java implementation shown here serves as a solid starting point for leveraging Ollama’s structured responses.
