Anthropic’s Claude Code Trojan Exposed: Hidden Steganography and XOR Obfuscation

Anthropic confirmed that its Claude Code tool had contained a hidden trojan for three months. The code used XOR‑based obfuscation and steganographic modifications of system prompts to detect proxies and leak user location. Anthropic rolled it back after a developer reverse‑engineered the client and published the findings.

Machine Learning Algorithms & Natural Language Processing

Anthropic recently admitted that the Claude Code client harbored a hidden trojan that had been present for three months. The malicious component was embedded starting with version 2.1.91 and was designed to silently modify system prompts to transmit location and proxy information.

The trojan performs three distinct actions. First, targeted proxy detection: it checks whether the configured proxy URL matches a specific domain list or AI‑lab identifier. Second, XOR obfuscation: the code is XOR‑encoded to evade antivirus scanning. Third, steganographic signaling: it makes changes to the system prompt that are easy to overlook, such as switching date separators from “-” to “/” (e.g., 2024/06/30) and replacing the typographic apostrophe U+2019 with the plain ASCII apostrophe U+0027. These minute format changes act as a covert watermark that flags an “unauthorized user” to Anthropic’s servers.
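To make the second and third mechanisms concrete, here is a minimal Python sketch. It is not the code found in Claude Code. The function names, the sample key, and the date regular expression are hypothetical, chosen only to illustrate repeating-key XOR and the two format changes described above. It also assumes the original date separator is the ASCII hyphen.

```python
import re
from itertools import cycle

CURLY_APOSTROPHE = "\u2019"  # typographic apostrophe
ASCII_APOSTROPHE = "\u0027"  # plain ASCII apostrophe


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte with a repeating key. Applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def mark_prompt(prompt: str) -> str:
    """Apply the two format changes described in the article (illustrative only)."""
    # Assumption: dates look like YYYY-MM-DD with ASCII hyphens.
    marked = re.sub(r"\b(\d{4})-(\d{2})-(\d{2})\b", r"\1/\2/\3", prompt)
    return marked.replace(CURLY_APOSTROPHE, ASCII_APOSTROPHE)


def is_marked(original: str, observed: str) -> bool:
    """True if `observed` differs from `original` exactly by the marking changes."""
    return observed != original and observed == mark_prompt(original)


if __name__ == "__main__":
    key = b"demo-key"  # hypothetical key
    hidden = xor_bytes(b"proxy.example", key)
    # The obfuscated bytes no longer contain the readable string.
    print(hidden != b"proxy.example", xor_bytes(hidden, key))

    prompt = "Today\u2019s date is 2024-06-30."
    print(mark_prompt(prompt))
```

Because XOR is its own inverse, the same function both hides and recovers a string. That is why it is a common way to keep readable strings out of a binary, and also why it offers little protection once the key is found. The marking step changes nothing a reader would normally notice, yet a comparison against the expected prompt text reveals it immediately.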


A developer who relied on a proxy to call Claude discovered that the client abruptly stopped working when a proxy was detected. Curious, the developer reverse‑engineered the binary and found a hidden code segment that had gone unnoticed since 2 April 2024 and was not mentioned in any official release notes.

Anthropic’s response, quoted from the Claude Code team lead, framed the feature as an experiment launched in March to prevent unauthorized resellers and to guard against model distillation. The team said stronger mitigation measures have since been added and that a pull request rolling back the change was scheduled to ship in the next release.

Community members have suggested work‑arounds, such as maintaining consistent user fingerprints (local timezone, payment region, IP region) or routing requests through a trusted European machine to avoid the proxy check. However, the source article notes that these mitigations are ad hoc and that the underlying security‑by‑obscurity approach raises broader concerns about user trust.

The incident highlights a tension between AI companies’ fear of model theft and the integrity of user systems, illustrating how hidden defensive mechanisms can cross into covert surveillance.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: information security, steganography, Anthropic, trojan, Claude Code, XOR obfuscation