Can Claude Translate the Linux Kernel to Rust? Insights, Experiments, and Costs
This article evaluates Claude's ability to translate isolated Linux kernel modules from C to Rust, presenting a detailed analysis of translation granularity, token costs, experimental results on drivers, networking, and file‑system modules, and discussing the technical and economic challenges of a full kernel rewrite.
Background
The Linux kernel comprises roughly 36 million lines of C code. Since Linux 6.1, Rust support has been added incrementally, with a few drivers and abstraction layers written in Rust. The main obstacle to a larger Rust migration is developer time: only a few hundred engineers understand both the kernel's C internals and Rust's ownership model.
Why Use an LLM?
The idea is to let a large language model (LLM) such as Claude generate initial Rust translations that experts can review, rather than writing the code from scratch.
Choosing the Right Translation Unit
Three strategies were examined:
File‑to‑file: Convert each .c file to a .rs file. This fails for most files because they depend on many headers, macros, and global state.
Subsystem‑to‑subsystem: Translate an entire subsystem (e.g., the CFS scheduler). Claude's context window (≈200 K tokens) is still too small for large subsystems.
Modular: Use loadable kernel modules (LKMs) as the unit. Modules have clear entry/exit points (module_init, module_exit) and limited size (<5 K lines), fitting comfortably within the LLM context.
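To make the entry/exit pairing concrete, here is a minimal userspace sketch of how a module lifecycle can map onto a constructor plus Drop. The trait name Module, the HelloModule type, and the shared log are hypothetical and exist only for illustration; this is not the kernel crate's actual API.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log so the example can observe what ran (illustration only).
type Log = Rc<RefCell<Vec<String>>>;

/// Userspace stand-in for a module entry point (the role module_init plays in C).
trait Module: Sized {
    fn init(log: Log) -> Result<Self, String>;
}

struct HelloModule {
    log: Log,
}

impl Module for HelloModule {
    fn init(log: Log) -> Result<Self, String> {
        log.borrow_mut().push("init".to_string());
        Ok(HelloModule { log })
    }
}

/// Drop plays the role of module_exit: cleanup runs once, when the value goes away.
impl Drop for HelloModule {
    fn drop(&mut self) {
        self.log.borrow_mut().push("exit".to_string());
    }
}

/// Loads and unloads the module, returning the observed lifecycle events.
fn load_and_unload() -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let _module = HelloModule::init(log.clone()).expect("init failed");
    } // `_module` is dropped here, so the exit hook runs
    let events = log.borrow().clone();
    events
}
```

Because the exit path is tied to the value's lifetime, a translation in this style cannot forget to run cleanup on an early return, which is one reason the module boundary is a convenient unit.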
Experimental Evaluation
Test 1 – Simple character device
Source: ~150 lines of C. Claude produced ~180 lines of Rust, correctly mapping file_operations to a Rust trait and using Box<T> for heap allocation. The translation was judged correct.
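The shape of such a translation can be sketched in plain userspace Rust. The FileOperations trait and EchoDevice type below are hypothetical stand-ins, not Claude's actual output and not the kernel crate's API; they only show the callback table becoming a trait and the device buffer becoming an owned heap allocation.

```rust
/// Userspace stand-in for the callbacks a C `struct file_operations` would hold.
trait FileOperations {
    fn read(&self, offset: usize, out: &mut [u8]) -> usize;
    fn write(&mut self, data: &[u8]) -> Result<usize, &'static str>;
}

/// A toy character device with a fixed-capacity heap buffer.
struct EchoDevice {
    buf: Box<[u8]>, // owned by the device and freed automatically on drop
    len: usize,     // number of valid bytes currently stored
}

impl EchoDevice {
    fn new(capacity: usize) -> Self {
        EchoDevice { buf: vec![0u8; capacity].into_boxed_slice(), len: 0 }
    }
}

impl FileOperations for EchoDevice {
    /// Copies stored bytes starting at `offset`; returns how many were copied.
    fn read(&self, offset: usize, out: &mut [u8]) -> usize {
        if offset >= self.len {
            return 0;
        }
        let n = out.len().min(self.len - offset);
        out[..n].copy_from_slice(&self.buf[offset..offset + n]);
        n
    }

    /// Appends `data`, or fails without writing anything if it does not fit.
    fn write(&mut self, data: &[u8]) -> Result<usize, &'static str> {
        if data.len() > self.buf.len() - self.len {
            return Err("no space left on device");
        }
        self.buf[self.len..self.len + data.len()].copy_from_slice(data);
        self.len += data.len();
        Ok(data.len())
    }
}
```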
Test 2 – Network driver (virtio‑net style)
Source: ~2 400 lines of C. Claude generated ~3 100 lines of Rust. High‑level structure (device registration, NAPI loop) was accurate, but DMA buffer management required unsafe blocks and missed the invariant that buffers must not be freed while in use.
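One way the missed invariant could be expressed in the type system is to move the buffer into an "in flight" value while the device owns it. The sketch below is a userspace illustration under that assumption; DmaBuffer, InFlight, submit, and complete are hypothetical names, and the device is simulated by a plain function call.

```rust
/// A buffer the driver owns and may freely read, write, or free.
struct DmaBuffer {
    data: Vec<u8>,
}

/// A buffer currently handed to the (simulated) device.
/// It exposes no accessors, so safe driver code cannot touch the memory
/// until `complete` hands the buffer back.
struct InFlight {
    buf: DmaBuffer,
}

impl DmaBuffer {
    fn new(len: usize) -> Self {
        DmaBuffer { data: vec![0; len] }
    }

    /// Consumes the buffer: the caller no longer holds a value it could reuse.
    fn submit(self) -> InFlight {
        InFlight { buf: self }
    }
}

impl InFlight {
    /// Simulates the completion interrupt: the device filled the buffer
    /// with `fill` and ownership returns to the driver.
    fn complete(mut self, fill: u8) -> DmaBuffer {
        self.buf.data.iter_mut().for_each(|b| *b = fill);
        self.buf
    }
}
```

This sketch only prevents access and reuse during the transfer. Dropping an InFlight value would still release the memory, so a real abstraction would also need a Drop policy, such as cancelling the transfer first, which is not modelled here.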
Test 3 – Simplified ext2‑style file system
Source: ~4 800 lines of C. Rust output (~5 600 lines) correctly handled superblock parsing and inode reads, but the VFS integration produced self‑referencing structs with incorrect lifetimes, preventing compilation without unsafe workarounds.
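A common way around self‑referencing structs is to replace the borrowed back‑pointer with a shared handle. The following userspace sketch assumes that approach; SuperBlock, Inode, MountedFs, and iget are simplified hypothetical types, not the real VFS interfaces.

```rust
use std::sync::Arc;

struct SuperBlock {
    block_size: u32,
    inodes_per_group: u32,
}

/// Instead of borrowing the superblock (`sb: &'a SuperBlock`), which becomes
/// self-referential once one struct owns both, each inode holds a shared handle.
struct Inode {
    ino: u32,
    sb: Arc<SuperBlock>,
}

impl Inode {
    /// Block group of this inode, ext2-style (inode numbers start at 1).
    fn group(&self) -> u32 {
        (self.ino - 1) / self.sb.inodes_per_group
    }

    fn block_size(&self) -> u32 {
        self.sb.block_size
    }
}

struct MountedFs {
    sb: Arc<SuperBlock>,
    inode_cache: Vec<Inode>,
}

impl MountedFs {
    fn mount(block_size: u32, inodes_per_group: u32) -> Self {
        MountedFs {
            sb: Arc::new(SuperBlock { block_size, inodes_per_group }),
            inode_cache: Vec::new(),
        }
    }

    /// Returns the cached inode, creating it on first use.
    fn iget(&mut self, ino: u32) -> &Inode {
        if let Some(pos) = self.inode_cache.iter().position(|i| i.ino == ino) {
            return &self.inode_cache[pos];
        }
        self.inode_cache.push(Inode { ino, sb: Arc::clone(&self.sb) });
        self.inode_cache.last().expect("just pushed")
    }
}
```

The trade‑off is a reference count per inode in exchange for a design that compiles without lifetime parameters or unsafe code.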
Hard Walls
Inline Assembly
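≈50 000 lines of architecture‑specific inline assembly exist in the kernel. Rust’s core::arch::asm! uses its own template and operand syntax and defaults to Intel syntax on x86, whereas GCC’s extended asm uses constraint strings and defaults to AT&T syntax. Claude often mixes the two, so these blocks require expert translation.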
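A single instruction is enough to show the syntax gap. The sketch below is not taken from the kernel; add_asm is a hypothetical helper, and a plain Rust fallback is included so the example still builds on targets other than x86_64.

```rust
/// Adds two integers with one `add` instruction on x86_64.
#[cfg(target_arch = "x86_64")]
fn add_asm(a: u64, b: u64) -> u64 {
    let mut sum = a;
    // GCC extended asm for the same operation (AT&T order, source first):
    //     asm("addq %1, %0" : "+r"(sum) : "r"(b));
    // Rust's asm! defaults to Intel syntax on x86, so the destination comes
    // first, and operands are declared by direction (in, out, inout)
    // rather than by constraint strings.
    unsafe {
        std::arch::asm!(
            "add {0}, {1}",
            inout(reg) sum,
            in(reg) b,
            options(nomem, nostack),
        );
    }
    sum
}

/// Fallback for other architectures so the sketch compiles everywhere.
#[cfg(not(target_arch = "x86_64"))]
fn add_asm(a: u64, b: u64) -> u64 {
    a.wrapping_add(b)
}
```

Swapping the operand order or keeping the `%0`/`%1` placeholders is exactly the kind of mix‑up that compiles poorly or, worse, assembles into the wrong instruction.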
Macro‑Heavy Code
Macros such as container_of, list_for_each_entry, READ_ONCE, and spin_lock_irqsave encode safety contracts. Claude expands them to unsafe pointer arithmetic but does not capture the underlying invariants, missing opportunities for richer Rust abstractions.
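As an illustration of the gap between expanding a macro and capturing its contract, the sketch below shows a raw container_of equivalent next to a small wrapper type that records where the pointer came from. Device, ListHead, and DeviceLink are hypothetical userspace types, and the sketch assumes a toolchain recent enough to provide std::mem::offset_of!.

```rust
use std::marker::PhantomData;
use std::mem::offset_of;
use std::ptr;

#[repr(C)]
struct ListHead {
    next: *const ListHead,
    prev: *const ListHead,
}

#[repr(C)]
struct Device {
    id: u32,
    link: ListHead,
}

/// Raw equivalent of `container_of(link, struct device, link)`.
///
/// # Safety
/// `link` must point at the `link` field of a live `Device` and must have
/// been derived from a pointer to that whole `Device`.
unsafe fn device_from_link_raw(link: *const ListHead) -> *const Device {
    unsafe { (link as *const u8).sub(offset_of!(Device, link)) as *const Device }
}

/// A link known to be embedded in a `Device`; the only constructor is
/// `Device::link`, so the contract above holds by construction.
struct DeviceLink<'a> {
    ptr: *const ListHead,
    _owner: PhantomData<&'a Device>,
}

impl Device {
    fn link(&self) -> DeviceLink<'_> {
        let dev: *const Device = self;
        // addr_of! derives the field pointer from the whole-struct pointer,
        // so stepping back to the container stays within the same allocation.
        let ptr = unsafe { ptr::addr_of!((*dev).link) };
        DeviceLink { ptr, _owner: PhantomData }
    }
}

impl<'a> DeviceLink<'a> {
    fn device(&self) -> &'a Device {
        // SAFETY: `ptr` was created by `Device::link` from a `Device`
        // that is borrowed for 'a.
        unsafe { &*device_from_link_raw(self.ptr) }
    }
}
```

The unsafe pointer arithmetic is still there, but it sits behind one audited function, and callers can no longer pass a link that belongs to some other struct.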
Concurrency Model
The kernel relies on spinlocks, RCU, and architecture‑specific memory ordering. While the kernel crate offers Mutex<T>, SpinLock<T>, and RCU abstractions, Claude still wraps C‑style patterns in unsafe instead of leveraging Rust’s type system.
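The difference between the two styles is easiest to see when the lock owns the data it protects. This userspace sketch uses std::sync::Mutex as a stand‑in for the kernel crate's SpinLock<T>; NetDevice, RxStats, and simulate_rx are hypothetical names.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

struct RxStats {
    packets: u64,
    bytes: u64,
}

/// In C the lock and the counters are sibling fields, and nothing stops code
/// from updating the counters without taking the lock. Here the counters can
/// only be reached through the guard returned by `lock()`.
struct NetDevice {
    stats: Mutex<RxStats>,
}

impl NetDevice {
    fn record_rx(&self, len: u64) {
        let mut stats = self.stats.lock().expect("lock poisoned");
        stats.packets += 1;
        stats.bytes += len;
    } // guard dropped here, which releases the lock
}

/// Runs several receive paths concurrently and returns (packets, bytes).
fn simulate_rx(threads: usize, packets_per_thread: u64, len: u64) -> (u64, u64) {
    let dev = Arc::new(NetDevice {
        stats: Mutex::new(RxStats { packets: 0, bytes: 0 }),
    });
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let dev = Arc::clone(&dev);
            thread::spawn(move || {
                for _ in 0..packets_per_thread {
                    dev.record_rx(len);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().expect("worker panicked");
    }
    let stats = dev.stats.lock().expect("lock poisoned");
    (stats.packets, stats.bytes)
}
```

A translation that keeps a bare lock next to unprotected fields and wraps each access in unsafe compiles, but it gives up exactly this guarantee.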
Behavioral Equivalence
Ensuring the Rust version matches the C version in all edge cases (e.g., ext4’s journal behavior, TCP’s TIME_WAIT handling, scheduler priority interactions) is extremely difficult because many of these behaviors are undocumented and only exercised by extensive test suites.
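At small scale, equivalence can at least be probed with differential testing: run the reference and the translation on the same inputs and report the first disagreement. The sketch below applies that idea to a toy 16‑bit ones'‑complement checksum; both functions and the first_mismatch harness are hypothetical, and the pseudo‑random generator is a simple xorshift chosen so the run is deterministic.

```rust
/// "Reference" behaviour: a line-by-line transliteration of a C-style loop.
fn checksum_reference(data: &[u8]) -> u16 {
    let mut sum: u32 = 0;
    let mut i = 0;
    while i + 1 < data.len() {
        sum += ((data[i] as u32) << 8) | data[i + 1] as u32;
        i += 2;
    }
    if i < data.len() {
        sum += (data[i] as u32) << 8; // odd trailing byte, padded with zero
    }
    while (sum >> 16) != 0 {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    !(sum as u16)
}

/// "Translated" behaviour: an idiomatic rewrite that should be equivalent.
fn checksum_translated(data: &[u8]) -> u16 {
    let mut sum: u32 = data
        .chunks(2)
        .map(|c| ((c[0] as u32) << 8) | c.get(1).copied().unwrap_or(0) as u32)
        .sum();
    while sum > 0xFFFF {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    !(sum as u16)
}

/// Feeds both versions the same inputs (0 to 64 bytes each) and returns the
/// first input on which they disagree, if any.
fn first_mismatch(cases: usize, seed: u64) -> Option<Vec<u8>> {
    let mut state = seed.max(1); // xorshift state must be non-zero
    let mut next = move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state
    };
    for _ in 0..cases {
        let len = (next() % 65) as usize;
        let input: Vec<u8> = (0..len).map(|_| (next() >> 24) as u8).collect();
        if checksum_reference(&input) != checksum_translated(&input) {
            return Some(input);
        }
    }
    None
}
```

Passing such a check shows agreement only on the inputs tried. The hard cases named above depend on timing, state, and hardware, which is why they need far more than input/output comparison.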
Token Economics
Average C line ≈1.5 tokens → ~54 M input tokens for the whole kernel.
Multiple passes (initial translation, review, iteration) → ≈150 M input tokens.
Output tokens roughly equal input tokens → ≈150 M output tokens.
Claude Opus pricing (2026): $15 per million input tokens, $75 per million output tokens.
Estimated API cost for full translation: >$13.5 M.
Targeting only the most critical subsystems (≈350 K lines) reduces the cost to about $1.3 M.
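The arithmetic behind these estimates can be written as a small calculator so the inputs are easy to vary. The function names are hypothetical, and the sketch reads the listed prices as $15 and $75 per million tokens, which is an assumption.

```rust
/// Tokens needed for one pass over the code base.
fn single_pass_tokens(lines: u64, tokens_per_line: f64) -> f64 {
    lines as f64 * tokens_per_line
}

/// API cost in US dollars; both prices are per million tokens.
fn api_cost_usd(
    input_tokens: f64,
    output_tokens: f64,
    usd_per_m_input: f64,
    usd_per_m_output: f64,
) -> f64 {
    (input_tokens * usd_per_m_input + output_tokens * usd_per_m_output) / 1_000_000.0
}

/// Plugs in the figures listed above: 36 M lines at 1.5 tokens per line,
/// then 150 M input and 150 M output tokens across all passes.
fn article_figures() -> (f64, f64) {
    let one_pass = single_pass_tokens(36_000_000, 1.5);
    let cost = api_cost_usd(150e6, 150e6, 15.0, 75.0);
    (one_pass, cost)
}
```

With the inputs exactly as listed, article_figures returns 54 million tokens for one pass and a raw API cost of about $13,500. That is far below the headline estimate, so the published totals appear to rest on assumptions beyond the ones stated here, such as a higher tokens‑per‑line ratio or many more passes. Both should be treated as order‑of‑magnitude guesses.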
Key Takeaways
1. Translation Granularity Determines Quality
Modules under 5 K lines yield high‑quality Rust; larger subsystems produce acceptable but imperfect results; whole‑system attempts fail.
2. Invariant Translation Is the Real Work
Converting syntax is easy; encoding C‑level invariants into Rust’s type system requires explicit prompts and human insight.
3. Correct Translation Is Not Plug‑and‑Play
Even syntactically correct Rust modules need adapter layers to match changed interfaces, error‑handling, and allocation models.
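A typical adapter of this kind converts C's negative‑errno return convention into a Rust Result. The sketch below is a userspace illustration; c_style_read stands in for an untranslated C routine, and the Errno enum and to_result helper are hypothetical, though ENOMEM (12) and EINVAL (22) are the usual Linux values.

```rust
const ENOMEM: i32 = 12;
const EINVAL: i32 = 22;

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum Errno {
    NoMem,
    Inval,
    /// Any other failure, keeping the raw (negative) return value.
    Other(i32),
}

/// Stand-in for a C routine: returns a byte count on success
/// or a negative errno on failure.
fn c_style_read(requested: i32) -> i32 {
    if requested < 0 {
        -EINVAL
    } else if requested > 4096 {
        -ENOMEM
    } else {
        requested
    }
}

/// Adapter: turns the C convention into a `Result` the Rust side can use with `?`.
fn to_result(ret: i32) -> Result<usize, Errno> {
    if ret >= 0 {
        Ok(ret as usize)
    } else if ret == -ENOMEM {
        Err(Errno::NoMem)
    } else if ret == -EINVAL {
        Err(Errno::Inval)
    } else {
        Err(Errno::Other(ret))
    }
}

/// What a translated caller would see.
fn read_adapter(requested: i32) -> Result<usize, Errno> {
    to_result(c_style_read(requested))
}
```

Every boundary between translated and untranslated code needs a layer like this, which is why a correct module in isolation is still not a drop‑in replacement.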
Realistic Roadmap
Phase 1 – Abstraction (Year 1): Extend the kernel crate to cover the full API surface, using Claude to generate safe wrappers from C headers.
Phase 2 – Leaf Drivers (Years 1‑3): Translate ~15 000 leaf drivers; most can be reviewed quickly.
Phase 3 – Filesystems (Years 2‑4): Start with simple filesystems (tmpfs, procfs) and progress to complex ones (ext4, btrfs, XFS), focusing on VFS integration.
Phase 4 – Network Stack (Years 3‑5): Translate TCP/IP and related components, paying special attention to edge‑case behavior.
Phase 5 – Core Subsystems (Years 5‑8): Tackle scheduler, memory manager, and process model, the most interconnected parts.
With AI assistance, the overall timeline could shrink to 8‑10 years versus the 20‑30 years projected for organic adoption.
True Value
The exercise is less about rewriting Linux in Rust and more about building better system‑programming tools: a formal invariants language, a behavior‑equivalence testing framework, and a type‑system mapping utility. These would benefit any C‑to‑Rust migration.
Conclusion
Claude cannot rewrite the entire Linux kernel in Rust, but it can produce useful drafts for bounded modules that human experts refine. For the majority of leaf‑node components, this approach is already viable; the core kernel still demands deep human expertise, though AI will gradually reduce the remaining workload.
Code Mala Tang
Read source code together, write articles together, and enjoy spicy hot pot together.
