Unlock AI Reasoning: How Ollama’s New ‘Thinking’ Feature Works
Version 0.9.0 of Ollama introduces a ‘thinking’ control that lets users view and manage the AI model’s reasoning process, with detailed CLI commands, REST API usage, model support list, scripting options, and advanced Modelfile configurations for models like DeepSeek R1 and Qwen 3.
Ollama v0.9.0 adds a “thinking” control: you can switch a model’s reasoning output on or off and choose whether to display it, which gives AI application developers finer control over how a model responds.
What is the Thinking feature?
The thinking feature lets the AI model display its internal reasoning before giving the final answer, essentially “thinking out loud” so users can see step‑by‑step analysis.
Models that support Thinking
DeepSeek R1 – a powerful open‑source model
Qwen 3 – Alibaba’s multilingual large model
More models are being added gradually
CLI usage
Basic control commands
Enable thinking:
<code># Enable with a flag
ollama run deepseek-r1 --think "Which is larger, 9.9 or 9.11?"
# In interactive mode
/set think</code>
Disable thinking:
<code># Disable with a flag
ollama run deepseek-r1 --think=false "Quickly compute 10+23"
# In interactive mode
/set nothink</code>
Scripting usage
To use a thinking model in a script but only see the final result, add the --hidethinking flag:
<code>ollama run deepseek-r1:8b --hidethinking "How many r's are in the word strawberry?"</code>
API call example
REST API
Ollama’s API supports the thinking feature via the think parameter:
<code>curl http://localhost:11434/api/chat -d '
{
"model": "deepseek-r1",
"messages": [
{
"role": "user",
"content": "Explain the principle of quantum entanglement"
}
],
"think": true,
"stream": false
}'
</code>
The response includes a thinking field that contains the model’s step‑by‑step reasoning, separate from the final answer in content.
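The same request can be issued from a script. The following is a minimal Python sketch using only the standard library, not an official client. It assumes a local Ollama server on the default port from the curl example, and it assumes the reasoning arrives under message.thinking and the answer under message.content in a non-streaming reply. The helper names (build_chat_payload, split_reasoning, chat) are made up for illustration.

```python
import json
import urllib.request

# Default local endpoint, taken from the curl example above.
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(model, prompt, think=True):
    """Build the JSON body for /api/chat, mirroring the curl example."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "think": think,
        "stream": False,
    }


def split_reasoning(response):
    """Return (thinking, answer) from a non-streaming chat response.

    Assumption: the reasoning sits in message["thinking"] and the answer
    in message["content"]. A missing field yields an empty string.
    """
    message = response.get("message", {})
    return message.get("thinking", ""), message.get("content", "")


def chat(model, prompt, think=True, url=OLLAMA_CHAT_URL):
    """POST the payload to a running Ollama server and return the parsed JSON."""
    body = json.dumps(build_chat_payload(model, prompt, think)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


if __name__ == "__main__":
    # Requires a running Ollama server with the deepseek-r1 model pulled.
    thinking, answer = split_reasoning(
        chat("deepseek-r1", "Which is larger, 9.9 or 9.11?")
    )
    print("Reasoning:", thinking)
    print("Answer:", answer)
```

Keeping the payload builder and the response splitter separate from the network call makes it easy to log or discard the reasoning text without touching the request code.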
Java integration libraries
At the time of writing, the mainstream Java AI libraries do not yet support the thinking flag:
Spring AI – does not support the think property
LangChain4j – does not support the think property
For Java projects, it is recommended to call the Ollama REST API directly.
Modelfile advanced configuration
Seamless thinking chain control for Qwen 3
Qwen 3 can be instructed to skip deep reasoning by embedding the /nothink keyword in the prompt. Using the Modelfile TEMPLATE feature, this can be added automatically without changing backend code.
<code>FROM qwen3:latest
# Only the user branch changes: "/nothink " is prepended to {{ .Content }}
TEMPLATE """
...
{{- if eq .Role "user" }}
<|im_start|>user
/nothink {{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}
<|im_start|>assistant
...
"""
</code>
Creating and using the model:
<code>ollama create qwen3-fast -f ./Modelfile
ollama run qwen3-fast "Explain the basic concepts of machine learning"
</code>
The created model automatically prefixes each user input with /nothink, enabling a default fast‑response mode.
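To make the effect of the template change concrete, here is a rough Python sketch that mimics how the modified user branch would render a conversation. It is an illustration only: it ignores the elided parts of the real template, the function name render_prompt is hypothetical, and the only markers used are the <|im_start|> and <|im_end|> tokens shown in the Modelfile above.

```python
# Keyword prepended to every user turn, as in the Modelfile example.
NOTHINK_PREFIX = "/nothink "


def render_prompt(messages, prefix=NOTHINK_PREFIX):
    """Render chat messages in the ChatML-style layout from the template.

    User turns get the prefix in front of their content; all other
    roles are rendered unchanged.
    """
    parts = []
    for message in messages:
        content = message["content"]
        if message["role"] == "user":
            content = prefix + content
        parts.append(f"<|im_start|>{message['role']}\n{content}<|im_end|>\n")
    return "".join(parts)


if __name__ == "__main__":
    conversation = [
        {"role": "user", "content": "Explain the basic concepts of machine learning"},
    ]
    print(render_prompt(conversation))
```

The same prefixing could be done in client code, but placing it in the template means every caller of qwen3-fast gets the behavior without any change on their side.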
Reference resources
Ollama official blog: https://ollama.com/blog/thinking
Ollama documentation: https://ollama.com/docs
Java Architecture Diary