Run AI Models Locally with Docker Model Runner and Java Integration
This article explains how Docker Model Runner enables effortless local execution of AI models, details platform support, provides a full command reference, shows how to use the REST endpoint, and demonstrates integration with Java via LangChain4j, including code examples and a feature comparison with Ollama.
Docker introduced the Model Runner feature in Docker Desktop 4.40, making it simple to run AI models locally without complex environment setup.
Current platform support: Docker Model Runner is available on Apple Silicon (M‑series) Macs, with Windows support planned for future releases.
The feature marks a significant step for Docker into AI development, allowing developers to manage and run large language models locally and avoid reliance on external cloud services.
Available Commands
Check Model Runner Status
Check whether Docker Model Runner is active:
docker model status
List All Commands
Show help information and available sub-commands:
docker model help
Output:
Usage: docker model COMMAND
Commands:
  list        List locally available models
  pull        Download a model from Docker Hub
  rm          Remove a downloaded model
  run         Run a model interactively or with a prompt
  status      Check if the model runner is running
  version     Show the current version
Pull a Model
Pull a model from Docker Hub to the local environment:
docker model pull <model>
Example:
docker model pull ai/deepseek-r1-distill-llama
Output:
Downloaded: 257.71 MB
Model ai/deepseek-r1-distill-llama pulled successfully
List Available Models
List all models currently pulled to the local environment:
docker model list
Sample output:
MODEL                         PARAMETERS  QUANTIZATION    ARCHITECTURE  MODEL ID      CREATED     SIZE
ai/deepseek-r1-distill-llama  361.82 M    IQ2_XXS/Q4_K_M  llama         354bf30d0aa3  1 days ago  256.35 MiB
Run a Model
Run a model with a single prompt or in interactive chat mode.
Single Prompt
docker model run ai/deepseek-r1-distill-llama "Hi"
Output:
Hello! How can I assist you today?
Interactive Chat
docker model run ai/deepseek-r1-distill-llama
Output:
Interactive chat mode started. Type '/bye' to exit.
> Hi
Hi there! It's SmolLM, AI assistant. How can I help you today?
> /bye
Chat session ended.
Delete a Model
docker model rm <model>
Output:
Model <model> removed successfully
Using the REST Endpoint
Enable host‑side TCP support in Docker Desktop GUI or CLI:
docker desktop enable model-runner --tcp <port>
Then interact via the chosen port, for example:
curl http://localhost:12434/engines/llama.cpp/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "ai/deepseek-r1-distill-llama",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Please write a summary about Docker."}
    ]
  }'
LangChain4j Integration
LangChain4j is a Java framework for building applications powered by large language models (LLMs), offering a simple way for Java developers to interact with various LLMs.
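For comparison with the curl call shown earlier, the same request can be sent from plain Java before any framework is involved. The following is a minimal sketch and not part of the original article. It assumes Java 11 or later and host-side TCP enabled on port 12434. The class and method names (RawChatRequest, jsonEscape, buildBody, buildRequest) are hypothetical. The hand-rolled JSON escaping is there only to keep the example dependency-free; a real application would more likely use a JSON library.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

class RawChatRequest {

    // Assumption: same host, port, and path as the curl example.
    static final String ENDPOINT =
            "http://localhost:12434/engines/llama.cpp/v1/chat/completions";

    /** Escapes characters that would otherwise break a JSON string literal. */
    static String jsonEscape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds a chat-completions body with one system and one user message. */
    static String buildBody(String model, String systemPrompt, String userPrompt) {
        return "{\"model\":\"" + jsonEscape(model) + "\",\"messages\":["
                + "{\"role\":\"system\",\"content\":\"" + jsonEscape(systemPrompt) + "\"},"
                + "{\"role\":\"user\",\"content\":\"" + jsonEscape(userPrompt) + "\"}]}";
    }

    /** Wraps the body in a POST request with the JSON content type. */
    static HttpRequest buildRequest(String body) {
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .timeout(Duration.ofSeconds(120))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        String body = buildBody("ai/deepseek-r1-distill-llama",
                "You are a helpful assistant.",
                "Please write a summary about Docker.");
        try {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(buildRequest(body), HttpResponse.BodyHandlers.ofString());
            System.out.println("HTTP " + response.statusCode());
            System.out.println(response.body());
        } catch (IOException e) {
            // Typically means the Model Runner is not listening on the assumed port.
            System.out.println("Model Runner not reachable: " + e.getMessage());
        }
    }
}
```

Running main prints the HTTP status and the raw JSON response when the Model Runner is reachable, and a short message otherwise.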
Setup Steps
1. Ensure Docker Model Runner Is Enabled
Make sure the Model Runner feature is turned on in Docker Desktop.
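One way to confirm from Java that the endpoint is listening is a plain TCP connection attempt. This is a hedged sketch rather than an official check. It assumes host-side TCP was enabled on port 12434 as in the curl example, and the class name ModelRunnerProbe and method isReachable are hypothetical. A successful connection shows only that something is listening on the port, not that a model is loaded.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class ModelRunnerProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unknown host.
            return false;
        }
    }

    public static void main(String[] args) {
        int port = 12434; // assumption: the port passed to --tcp
        if (isReachable("localhost", port, 1000)) {
            System.out.println("Something is listening on port " + port);
        } else {
            System.out.println("Nothing is listening on port " + port
                    + "; check that Model Runner and host-side TCP are enabled");
        }
    }
}
```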
2. Add LangChain4j Dependency
Add the following dependencies to your pom.xml:
<dependencies>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j</artifactId>
        <version>1.0.0-beta2</version>
    </dependency>
    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-open-ai</artifactId>
        <version>1.0.0-beta2</version>
    </dependency>
</dependencies>
3. Pull and Run the Desired Model
docker model pull ai/deepseek-r1-distill-llama
4. Configure LangChain4j to Connect to the Local Model
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.model.openai.OpenAiChatModel;

public class ModelConfig {
    public ChatLanguageModel chatLanguageModel() {
        return OpenAiChatModel.builder()
                .baseUrl("http://localhost:12434/engines/llama.cpp/v1")
                .modelName("ai/deepseek-r1-distill-llama")
                .temperature(0.7)
                .build();
    }
}
Sample Application
import dev.langchain4j.model.chat.ChatLanguageModel;
import dev.langchain4j.service.AiServices;

public class DockerModelExample {
    // LangChain4j generates the implementation of this interface at runtime.
    interface Assistant {
        String chat(String message);
    }

    public static void main(String[] args) {
        ModelConfig config = new ModelConfig();
        ChatLanguageModel model = config.chatLanguageModel();
        Assistant assistant = AiServices.builder(Assistant.class)
                .chatLanguageModel(model)
                .build();
        String response = assistant.chat("Write a simple Hello World program in Java");
        System.out.println(response);
    }
}
Summary
Docker Model Runner and Ollama both aim to simplify local AI model execution, but Docker Model Runner is tightly integrated with the Docker ecosystem, while Ollama is a standalone, cross‑platform tool with broader language support and more flexible model customization.
Java Architecture Diary
Committed to sharing original, high‑quality technical articles; no fluff or promotional content.
