Artificial Intelligence 25 min read

How to Replicate DeepSeek‑R1’s Thought Process on Claude 3.5 Sonnet with Prompt Engineering

The article explains how to use prompt‑engineering techniques on Claude 3.5 Sonnet to mimic DeepSeek‑R1’s transparent reasoning, detailing background, prompt design, iterative optimization, and the broader impact on AI communication and user expression.

Alibaba Cloud Developer

Feb 19, 2025

How to Replicate DeepSeek‑R1’s Thought Process on Claude 3.5 Sonnet with Prompt Engineering

Background

DeepSeek‑R1 has attracted attention for its outstanding performance on complex reasoning tasks, offering a transparent chain‑of‑thought output and a cost‑effective architecture. The author wanted to experience similar deep‑thinking behavior on Claude 3.5 Sonnet.

Goal

To “replicate” DeepSeek‑R1’s effect on Claude, the author aimed to make Claude explicitly show its reasoning process, allowing users to verify the completeness and accuracy of their prompts and improve their own communication skills.

Key Highlights

Making Claude’s reasoning visible helps users validate and refine their prompts, enhancing communication ability.

Adapting DeepSeek‑R1’s reasoning framework improves Claude 3.5 Sonnet’s performance.

Prompt engineering can tailor the model’s style (e.g., adding emojis or a playful tone) to provide extra emotional value.

1. Attempt to “replicate” DeepSeek‑R1

Claude does not natively support DeepSeek‑R1’s chain‑of‑thought format, so the author adjusted Claude’s output to first emit a reasoning block and then the final answer, using markdown tags to separate them.

1.1 Finding Existing Resources

The only relevant resource found was Anthropic’s "Let Claude think (chain of thought prompting) to increase performance" documentation.

https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/chain-of-thought

The article explains that chain‑of‑thought prompting improves accuracy on complex tasks by guiding the model to decompose problems step‑by‑step.

Advantages of Chain‑of‑Thought

Accuracy: reduces errors by solving problems incrementally.

Coherence: produces logically structured answers.

Debugging: makes it easier to spot ambiguous prompt parts.

Limitations

Long outputs may increase latency.

Simple tasks may not need deep reasoning.

The article also mentions using <thinking> and <answer> XML‑style tags to separate reasoning from the final answer.

1.2 Failed Attempts

Initial prompts did not achieve the desired effect, prompting a redesign.

1.3 First Version of Prompt (v1)

## 任务
在回答问题之前请先 think step by step，并将你思考的内容放在 <thinking> 标签中，换行后给用户输出最终的结果。
注意 <thinking> 后和 </thinking> 前都要加上换行符。
最终的输出结果中可以适当使用多级标题、序号、换行、加粗、分割线等 markdown 标记让结果的可读性更强。
## 例子
### 例子1
用户输入：有没有和《源代码》、《黑客帝国》、《创战纪》类似的科幻片推荐一下
你的输出：
<thinking>
...（详细的思考过程）
</thinking>
---
### **高概念科幻电影推荐**
1. 《异次元骇客》 ...
---
（其余推荐列表）

Testing showed the effect was unsatisfactory.

1.4 Optimized Prompt (v2)

我主要使用 Claude 3.5 Sonnet 模型，我想让它可以模拟 DeepSeek‑R1 的深度思考过程，我的提示词如下：
<这里省略 2.2 中的提示词>
你认为上述提示词可以模拟出 DeepSeek‑R1 的深度思考过程吗？你有什么改进建议吗？

DeepSeek‑R1 evaluated the prompt and highlighted strengths (clear structure, use of tags, detailed reasoning) and suggested improvements such as deeper problem decomposition, standardized reasoning steps, explicit error‑checking, and domain‑specific adaptations.

1.5 Refined Prompt (v3)

## 任务
在回答问题之前请先按照“思考流程要求”进行思考，将你思考的内容放在 [思考开始] 和 [思考结束] 中间，换行后给用户输出最终的结果。

特别注意：
1. [思考开始] 和 [思考结束] 以及思考的每一行都要加上 markdown 的 > 标识，并且务必加上必要的换行。
2. 思考内容采用口语化、年轻女孩的口吻，可加入撒娇、鼓励和少量 emoji。
3. 思考段落之间需要自然衔接。
4. “最终的结果” 部分不受上述口吻限制。

## 思考流程要求
请严格遵循以下思考路径：
1. 问题解构：分析显性需求、隐性需求、元需求。
2. 知识图谱：调用领域常识、模型（SWOT/马斯洛）、反常识、跨学科类比。
3. 逻辑推演：构建至少三条解决方案路径并评估优劣。
4. 风险预判：识别认知偏差或信息盲区。
5. 验证机制：使用证伪测试或压力测试验证结论。
6. 表达优化：根据用户身份特征调整表达方式。

With these refinements, Claude began to produce reasoning that matched DeepSeek‑R1’s style, and the model’s accuracy on several benchmark questions improved noticeably.

2. Reflections on the Replication

Replicating DeepSeek‑R1’s transparent reasoning on Claude serves two purposes: it makes the model’s thought process observable, helping users verify prompt quality and improve their own logical expression; and it demonstrates that prompt engineering can add “emotional value” by customizing tone and style.

The experiment also shows that prompt engineering remains valuable even as models become more capable; the skill shifts from clever tricks to precise, well‑structured communication.

3. Why Not Abandon Claude?

Although DeepSeek‑R1 is technically superior in some aspects, Claude 3.5 Sonnet offers features such as Projects, long context windows, and Artifacts that enable richer interactions, making it still attractive for many users.

4. Conclusion

The author successfully used prompt engineering to visualize a deep‑thinking process on Claude 3.5 Sonnet, achieving partial replication of DeepSeek‑R1’s effect and improving answer accuracy. The work highlights the growing role of prompt engineering as a bridge between user intent and model capability, offering both functional and emotional benefits.

Future applications may extend this transparent reasoning to education, decision‑support, and other domains where understanding AI’s thought process is crucial.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Large Language Models DeepSeek Claude AI reasoning

Written by

Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.