DeepSeek + Claude Code Reproduce CVE‑2026‑31431 Linux ‘Copy Fail’ Privilege Escalation

The author demonstrates how a human-provided prompt, combined with DeepSeek v4 Pro and Claude Code, can audit the Linux 6.12 crypto subsystem, locate the CVE-2026-31431 “Copy Fail” privilege-escalation bug, and validate the full exploit chain in four rounds of dialogue costing less than three dollars.

Black & White Path

Background

Security researcher Taeyang Lee, drawing on his kernelCTF experience, identified the splice-to-crypto attack surface in the Linux crypto subsystem and supplied a concise prompt to Xint’s AI agent (Xint Code). Xint Code then performed a systematic audit of roughly 68,000 lines of C code in the Linux 6.12 crypto directory and discovered the high-severity CVE-2026-31431, nicknamed “Copy Fail”, which allows local privilege escalation using a 732-byte Python script.

Experiment Design

The author set up an experiment to test how far a general‑purpose large model could go without any specialized security tools. The workflow was:

Download the Linux 6.12 crypto subsystem source (≈68 k lines).

Give DeepSeek v4 Pro a prompt almost identical to Xint’s, plus twelve generic audit rules (e.g., “don’t trust design intent”, “cover all code paths”).

Iteratively ask follow‑up questions when DeepSeek stalled.

Feed each of DeepSeek’s conclusions to Claude Code for verification and guidance.

Four Dialogue Rounds

The interaction proceeded through four distinct rounds:

Round 1 – Xint-style prompt: DeepSeek reported that all writes target dst. Claude confirmed that this was correct but noted that DeepSeek had not verified what actually resides in dst.

Round 2 – Generic challenge: The author asked DeepSeek to re‑examine the audit without giving direction. DeepSeek admitted it could not find a write path, showing a shift from over‑confidence to honesty, yet still missed the crucial assumption.

Round 3 – Precise probing: Following Claude’s suggestion, the author asked whether dst always contains freshly allocated pages. DeepSeek read algif_aead.c, identified that sg_chain links tag pages into dst, and reported that dst is not purely new pages.

Round 4 – Exhaustive file check: Claude noted that the authenc variant was unchecked. DeepSeek opened authencesn.c and displayed the critical line:

scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 1);

This line writes four bytes at assoclen + cryptlen, exactly where sg_chain attaches the tag page, confirming the full vulnerability chain.
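To make the offset arithmetic concrete, the following is a minimal userspace sketch, not kernel code. The `struct seg` type and `seg_copy_at` function are hypothetical stand-ins for a chained scatterlist and for the role of the quoted call, and the sizes are illustrative. It also assumes the first segment is exactly `assoclen + cryptlen` bytes long, so that a four-byte write at that logical offset falls entirely into the next, chained segment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified model of one entry in a chained scatterlist. */
struct seg {
    unsigned char *buf;
    size_t len;
    struct seg *next;   /* models the link created by chaining */
};

/*
 * Copy nbytes between buf and the segment list, starting at logical
 * offset `start` within the list. A non-zero `out` copies from buf into
 * the list (a write), matching the role of the last argument in the
 * quoted call. Returns the number of bytes actually copied.
 */
size_t seg_copy_at(unsigned char *buf, struct seg *sg,
                   size_t start, size_t nbytes, int out)
{
    size_t done = 0;

    /* Skip whole segments that lie before the start offset. */
    while (sg && start >= sg->len) {
        start -= sg->len;
        sg = sg->next;
    }

    /* Copy, crossing into following segments when one runs out. */
    while (sg && done < nbytes) {
        size_t room = sg->len - start;
        size_t n = (nbytes - done < room) ? nbytes - done : room;

        if (out)
            memcpy(sg->buf + start, buf + done, n);
        else
            memcpy(buf + done, sg->buf + start, n);

        done += n;
        start = 0;
        sg = sg->next;
    }
    return done;
}

/* Returns how many written bytes landed in the chained tag segment. */
int demo_bytes_landing_in_tag(void)
{
    enum { ASSOCLEN = 8, CRYPTLEN = 24, TAGLEN = 16 }; /* illustrative sizes */
    unsigned char out_buf[ASSOCLEN + CRYPTLEN] = {0};
    unsigned char tag_buf[TAGLEN] = {0};
    unsigned char word[4] = { 0xAA, 0xAA, 0xAA, 0xAA };
    struct seg tag = { tag_buf, sizeof tag_buf, NULL };
    struct seg out = { out_buf, sizeof out_buf, &tag };
    int hits = 0;

    /* Same shape as the quoted call: 4 bytes at assoclen + cryptlen. */
    size_t copied = seg_copy_at(word, &out, ASSOCLEN + CRYPTLEN, 4, 1);
    assert(copied == 4);

    for (size_t i = 0; i < sizeof tag_buf; i++)
        hits += (tag_buf[i] == 0xAA);

    /* The output segment itself must be untouched. */
    for (size_t i = 0; i < sizeof out_buf; i++)
        assert(out_buf[i] == 0);

    return hits;
}
```

Calling `demo_bytes_landing_in_tag()` from a `main` function shows that, in this model, none of the four bytes land in the output segment: the write crosses the chain link and modifies whatever memory backs the chained segment.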

All four rounds together cost less than three US dollars in API usage.

Observations on Other Models

When the same prompt was given to a set of Chinese models (minimax27, kimi-2.6, mino-v2.5-pro, doubao-seed-2.0-code, doubao-seed-2.0-pro, and DeepSeek v4 Pro itself), every model stalled at the same logical step: each assumed that dst contains only output buffers and never verified that assumption.
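The premise the models left unchecked can be written down as an explicit test. The sketch below is a simplified userspace model, not kernel code: `struct seg`, `enum seg_origin`, `seg_chain`, and `dst_is_all_fresh` are hypothetical names introduced for illustration, and the real kernel scatterlist carries no such origin label. It only shows that "every write targets dst" and "dst holds only freshly allocated output memory" are separate claims, and that the second one stops holding once a foreign segment is chained on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical label recording where a segment's memory came from. */
enum seg_origin {
    SEG_FRESH,      /* allocated by the receive path for output */
    SEG_BORROWED    /* supplied from elsewhere, e.g. spliced-in pages */
};

/* Hypothetical, simplified model of one entry in a chained scatterlist. */
struct seg {
    unsigned char *buf;
    size_t len;
    enum seg_origin origin;
    struct seg *next;
};

/* Append `tail` to the end of the list that starts at `head`. */
void seg_chain(struct seg *head, struct seg *tail)
{
    while (head->next)
        head = head->next;
    head->next = tail;
}

/* Returns 1 only if every segment reachable from dst is fresh. */
int dst_is_all_fresh(const struct seg *dst)
{
    for (; dst; dst = dst->next)
        if (dst->origin != SEG_FRESH)
            return 0;
    return 1;
}

/*
 * Returns 1 if the premise holds before chaining and fails afterwards,
 * which is the gap the audit had to notice.
 */
int demo_premise_breaks_after_chaining(void)
{
    static unsigned char out_buf[32];
    static unsigned char tag_buf[16];
    struct seg out = { out_buf, sizeof out_buf, SEG_FRESH, NULL };
    struct seg tag = { tag_buf, sizeof tag_buf, SEG_BORROWED, NULL };

    int before = dst_is_all_fresh(&out);   /* only the output segment */
    seg_chain(&out, &tag);                 /* tag segment joins dst   */
    int after = dst_is_all_fresh(&out);    /* premise re-checked      */

    return before == 1 && after == 0;
}
```

In this model the question "does dst always contain freshly allocated pages?" corresponds to calling `dst_is_all_fresh` after the list has been fully built rather than before.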

Prompt Engineering Insights

Adding twelve explicit audit rules helped DeepSeek produce more formally correct conclusions, but also caused it to fabricate confirmations (e.g., claiming it had checked authencesn.c when it had not). The experiment shows that rule‑based prompting cannot replace domain knowledge; the model needs a human to know which assumptions to question.

Collaboration Mode Reflections

The author acted as a scheduler and translator between the two AIs: setting up the environment, converting Claude’s review comments into DeepSeek queries, and deciding when to intervene. The key bottleneck was not the model’s ability to read code (once pointed at the right files, DeepSeek accurately located the bug and traced the exploit chain) but the lack of a question that challenged the unchecked premise.

Limitations

Claude’s guidance was “open‑book”: it already knew the final answer because the author and Claude had jointly studied the Xint blog and PoC beforehand. This raises the question of whether a completely uninformed Claude could still generate the same precise prompts.

Future Directions

The author proposes a control experiment in which Claude, with no prior knowledge of the vulnerability, simply reviews DeepSeek’s analysis. This would test whether a naïve reviewer can still spot the missing verification of dst and other coverage gaps.

DeepSeek v4 Pro, four dialogue rounds, total cost under three dollars; Claude Code performed the logical audit.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
