How I Built a Toy Java Virtual Machine in Rust – Lessons and Code
After learning Rust, I wrote a toy Java Virtual Machine called rjvm and open‑sourced it on GitHub. This article covers its architecture, class file parsing, bytecode execution, value modeling, instruction handling, exception processing, and a simple stop‑the‑world garbage collector, with code snippets along the way.
Introduction
After spending considerable time learning Rust, I embarked on an ambitious project: writing a Java Virtual Machine (JVM) in Rust, named rjvm. The code is open‑source on GitHub.
This is a toy JVM built for learning purposes, not a production‑ready implementation. It deliberately omits several features, including generics, threads, reflection, annotations, I/O, just‑in‑time compilation, and string interning.
Implemented Core Features
Control‑flow statements (if, for, …)
Primitive and object creation
Virtual and static method invocation
Exception handling
Garbage collection
Parsing of classes from .class files in rt.jar
Test Suite Example
class StackTracePrinting {
    public static void main(String[] args) {
        Throwable ex = new Exception();
        StackTraceElement[] stackTrace = ex.getStackTrace();
        for (StackTraceElement element : stackTrace) {
            tempPrint(
                element.getClassName() + "::" + element.getMethodName() + " - " +
                element.getFileName() + ":" + element.getLineNumber());
        }
    }

    // We use this in place of System.out.println because we don't have real I/O
    private static native void tempPrint(String value);
}
The test uses real classes from OpenJDK 7's rt.jar, so java.lang.StackTraceElement is genuine.
Code Organization
The project follows a standard Rust layout split into three crates (packages):
reader : reads .class files and models their contents.
vm : the virtual machine that can execute code as a library.
vm_cli : a simple command‑line launcher to run the VM on compiled Java files.
I am considering publishing reader separately on crates.io because it may be useful to other developers.
Parsing .class Files
Java source files are compiled by javac into .class files, which are binary files containing the class metadata and the bytecode of its methods. JAR files such as rt.jar are ZIP archives that bundle many .class files. Loading a class involves reading its metadata and the associated bytecode.
Metadata includes class name, source file name, superclass name, implemented interfaces, fields with types and annotations.
Method information includes descriptors, throws clauses, generic signatures, bytecode, exception tables, and line number tables.
The reader crate parses these structures into Rust data types (see class_file.rs).
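To make the parsing step concrete, here is a minimal sketch of reading the first eight bytes of a .class file: the 0xCAFEBABE magic number followed by the minor and major version, all big‑endian. The names ClassHeader, HeaderError, and parse_header are assumptions made for this illustration and are not taken from the reader crate.

```rust
/// Version numbers found in the first 8 bytes of a .class file.
/// (Hypothetical type, not the reader crate's real model.)
#[derive(Debug, PartialEq, Eq)]
pub struct ClassHeader {
    pub minor_version: u16,
    pub major_version: u16,
}

#[derive(Debug, PartialEq, Eq)]
pub enum HeaderError {
    /// Fewer than 8 bytes were available.
    TooShort,
    /// The file does not start with 0xCAFEBABE.
    BadMagic(u32),
}

/// Parses the magic number and version fields, which are stored big-endian.
pub fn parse_header(bytes: &[u8]) -> Result<ClassHeader, HeaderError> {
    if bytes.len() < 8 {
        return Err(HeaderError::TooShort);
    }
    let magic = u32::from_be_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    if magic != 0xCAFE_BABE {
        return Err(HeaderError::BadMagic(magic));
    }
    Ok(ClassHeader {
        minor_version: u16::from_be_bytes([bytes[4], bytes[5]]),
        major_version: u16::from_be_bytes([bytes[6], bytes[7]]),
    })
}
```

A class compiled for Java 7 carries major version 51, so the bytes CA FE BA BE 00 00 00 33 parse to minor 0 and major 51. The constant pool, fields, and methods follow this header and need considerably more code.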
Method Execution
The VM’s main API is Vm::invoke, which executes a method. Execution requires a CallStack containing CallFrame objects—one per active method. Each frame holds a local variable array and an operand stack. New frames are pushed on method calls and popped when methods return.
Most methods are implemented in Java bytecode, but the VM also supports native methods (e.g., System::currentTimeMillis, System::arraycopy, Throwable::fillInStackTrace) implemented in Rust.
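The frame structure described above can be sketched in a few lines. This is a simplified illustration, not rjvm's actual CallFrame: it stores bare i32 values instead of a full Value type, and the Frame, Op, and FrameError names are assumptions introduced here.

```rust
/// Hypothetical frame holding only ints: a local variable array plus an
/// operand stack, as each active method would own.
#[derive(Debug)]
pub struct Frame {
    locals: Vec<i32>,
    stack: Vec<i32>,
}

/// A three-instruction subset, enough to add two locals and return.
#[derive(Debug, PartialEq, Eq)]
pub enum Op {
    Iload(usize),
    Iadd,
    Ireturn,
}

#[derive(Debug, PartialEq, Eq)]
pub enum FrameError {
    StackUnderflow,
    BadLocal(usize),
    MissingReturn,
}

impl Frame {
    pub fn new(locals: Vec<i32>) -> Self {
        Frame { locals, stack: Vec::new() }
    }

    fn pop(&mut self) -> Result<i32, FrameError> {
        self.stack.pop().ok_or(FrameError::StackUnderflow)
    }

    /// Runs the instructions in order until an `Ireturn` is reached.
    pub fn run(&mut self, code: &[Op]) -> Result<i32, FrameError> {
        for op in code {
            match op {
                Op::Iload(index) => {
                    let value = *self
                        .locals
                        .get(*index)
                        .ok_or(FrameError::BadLocal(*index))?;
                    self.stack.push(value);
                }
                Op::Iadd => {
                    let rhs = self.pop()?;
                    let lhs = self.pop()?;
                    // Java int addition wraps around on overflow.
                    self.stack.push(lhs.wrapping_add(rhs));
                }
                Op::Ireturn => return self.pop(),
            }
        }
        Err(FrameError::MissingReturn)
    }
}
```

Running Iload(0), Iload(1), Iadd, Ireturn on a frame whose locals are 2 and 40 returns 42. In a full VM, a method call would push a new frame onto the call stack and the returned value would be pushed onto the caller's operand stack.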
Value and Object Modeling
/// Models a generic value that can be stored in a local variable or on the stack.
#[derive(Debug, Default, Clone, PartialEq)]
pub enum Value<'a> {
    /// An uninitialized element (should never appear on the stack).
    #[default]
    Uninitialized,
    /// 32‑bit integral types: boolean, byte, char, short, int.
    Int(i32),
    /// 64‑bit long.
    Long(i64),
    /// 32‑bit float.
    Float(f32),
    /// 64‑bit double.
    Double(f64),
    /// Object reference.
    Object(AbstractObject<'a>),
    /// Null reference.
    Null,
}
Objects were initially represented by a simple Object struct holding a class reference and a Vec<Value> for fields. With the garbage collector in place, that struct was replaced by a lower‑level representation using raw pointers and manual casting.
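For illustration, the simple struct-based object model could look roughly like the sketch below. It is a guess at the shape rather than rjvm's code: Value is trimmed to two variants, and Class, Object, and their methods are hypothetical names.

```rust
/// Trimmed-down stand-in for the article's `Value` enum (assumption: two
/// variants are enough for this illustration).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Int(i32),
    Null,
}

/// Hypothetical class descriptor: a name and an ordered list of field names.
pub struct Class {
    pub name: String,
    pub field_names: Vec<String>,
}

/// An object is a reference to its class plus one `Value` per field.
pub struct Object<'a> {
    class: &'a Class,
    fields: Vec<Value>,
}

impl<'a> Object<'a> {
    /// Creates an instance with every field set to `Null`. A real JVM picks
    /// a type-specific default instead, such as 0 for `int`.
    pub fn new(class: &'a Class) -> Self {
        Object {
            class,
            fields: vec![Value::Null; class.field_names.len()],
        }
    }

    pub fn class_name(&self) -> &str {
        &self.class.name
    }

    fn index_of(&self, name: &str) -> Option<usize> {
        self.class
            .field_names
            .iter()
            .position(|field| field.as_str() == name)
    }

    pub fn get_field(&self, name: &str) -> Option<&Value> {
        self.index_of(name).map(|index| &self.fields[index])
    }

    /// Returns false when the class has no field with that name.
    pub fn set_field(&mut self, name: &str, value: Value) -> bool {
        match self.index_of(name) {
            Some(index) => {
                self.fields[index] = value;
                true
            }
            None => false,
        }
    }
}
```

A model like this is easy to write but leaves object lifetimes to Rust's ownership rules, which is one reason a garbage-collected heap tends to push toward a lower-level layout.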
Instruction Modeling
/// Represents a Java bytecode instruction.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Instruction {
    Aaload,
    Aastore,
    Aconst_null,
    Aload(u8),
    // ... (many more instructions, over 200 total)
}
Each instruction may have operands; the VM fetches, decodes, and executes them, updating the program counter accordingly.
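The fetch-and-decode step can be sketched as a function that reads one opcode, consumes its operands, and returns the address of the next instruction. The opcode values below are the standard JVM ones for these four instructions; the decode function and DecodeError type are assumptions for this example and not rjvm's API.

```rust
/// The four instructions from the excerpt above.
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Instruction {
    Aaload,
    Aastore,
    Aconst_null,
    Aload(u8),
}

#[derive(Debug, Eq, PartialEq)]
pub enum DecodeError {
    /// The code array ended before the instruction was complete.
    UnexpectedEnd,
    /// An opcode this sketch does not handle.
    UnknownOpcode(u8),
}

/// Decodes the instruction at `pc` and returns it together with the
/// program counter of the following instruction.
pub fn decode(code: &[u8], pc: usize) -> Result<(Instruction, usize), DecodeError> {
    let opcode = *code.get(pc).ok_or(DecodeError::UnexpectedEnd)?;
    match opcode {
        0x01 => Ok((Instruction::Aconst_null, pc + 1)),
        0x19 => {
            // aload takes one operand byte: the local variable index.
            let index = *code.get(pc + 1).ok_or(DecodeError::UnexpectedEnd)?;
            Ok((Instruction::Aload(index), pc + 2))
        }
        0x32 => Ok((Instruction::Aaload, pc + 1)),
        0x53 => Ok((Instruction::Aastore, pc + 1)),
        other => Err(DecodeError::UnknownOpcode(other)),
    }
}
```

Returning the next program counter keeps the decoder stateless, so branch instructions could later override it with their own target.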
Instruction Execution
fn execute_instruction(
    &mut self,
    vm: &mut Vm<'a>,
    call_stack: &mut CallStack<'a>,
    instruction: Instruction,
) -> Result<InstructionCompleted<'a>, MethodCallFailed<'a>> {
    // implementation omitted for brevity
}
The execution result can be one of:
Successful execution, continue the method.
Return instruction, unwind with an optional return value.
Internal VM error.
Java exception thrown.
Exception Handling
Each catch block corresponds to an entry in the method’s exception table, specifying the PC range, handler address, and exception class. When an exception is thrown, the VM searches for a matching handler; if none is found, the exception propagates up the call stack.
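The handler search within a single method can be sketched as follows. The entry layout mirrors the class file's exception table (start inclusive, end exclusive, first matching entry wins), but ExceptionTableEntry, find_handler, and the exact_match helper are hypothetical names, and a real VM would test subclass relationships rather than compare class names.

```rust
/// One row of a method's exception table.
pub struct ExceptionTableEntry {
    /// First covered bytecode address (inclusive).
    pub start_pc: u16,
    /// End of the covered range (exclusive).
    pub end_pc: u16,
    /// Address execution jumps to when this entry matches.
    pub handler_pc: u16,
    /// Class to catch; `None` catches everything (used for `finally`).
    pub catch_class: Option<String>,
}

/// Returns the handler address for an exception thrown at `pc`, or `None`
/// if the exception must propagate to the caller's frame.
pub fn find_handler<F>(
    table: &[ExceptionTableEntry],
    pc: u16,
    thrown_class: &str,
    is_instance_of: F,
) -> Option<u16>
where
    F: Fn(&str, &str) -> bool,
{
    table
        .iter()
        .find(|entry| {
            let covers_pc = entry.start_pc <= pc && pc < entry.end_pc;
            let type_matches = match &entry.catch_class {
                Some(class) => is_instance_of(thrown_class, class),
                None => true,
            };
            covers_pc && type_matches
        })
        .map(|entry| entry.handler_pc)
}

/// Stand-in type check that only accepts identical class names (assumption:
/// a real VM walks the superclass and interface hierarchy here).
pub fn exact_match(thrown_class: &str, catch_class: &str) -> bool {
    thrown_class == catch_class
}
```

When find_handler returns None, the VM would pop the current frame and repeat the search in the caller, using the address of the call instruction as the pc.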
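The possible outcomes of executing an instruction, including a thrown Java exception, are modeled with two enums:
enum InstructionCompleted<'a> {
    ReturnFromMethod(Option<Value<'a>>),
    ContinueMethodExecution,
}

enum MethodCallFailed<'a> {
    InternalError(VmError),
    ExceptionThrown(JavaException<'a>),
}
Garbage Collection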
The final milestone of rjvm is a simple stop‑the‑world garbage collector based on Cheney’s copying algorithm. Memory is split into two semi‑spaces; allocation occurs in the active space, and when it fills, live objects are copied to the other space and all references to them are updated. This approach gives up half of the heap, since only one semi‑space is usable at a time, in exchange for fast allocation and no fragmentation.
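The copying scheme can be sketched in safe Rust by using indices in place of raw pointers. This is a simplified model under stated assumptions, not rjvm's collector: each semi‑space is a Vec of cells, capacity is counted in objects rather than bytes, and Slot, Cell, and Heap are names invented for the example.

```rust
/// A field holds either a primitive or a reference (an index into the heap).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Slot {
    Int(i32),
    Ref(usize),
    Null,
}

struct Cell {
    fields: Vec<Slot>,
    /// Set once the cell has been copied: its index in the new space.
    forwarded_to: Option<usize>,
}

pub struct Heap {
    active: Vec<Cell>,
    capacity: usize,
}

impl Heap {
    pub fn new(capacity: usize) -> Self {
        Heap { active: Vec::new(), capacity }
    }

    /// Bump allocation: returns `None` when the active semi-space is full.
    pub fn alloc(&mut self, fields: Vec<Slot>) -> Option<usize> {
        if self.active.len() >= self.capacity {
            return None;
        }
        self.active.push(Cell { fields, forwarded_to: None });
        Some(self.active.len() - 1)
    }

    pub fn field(&self, object: usize, index: usize) -> Slot {
        self.active[object].fields[index]
    }

    pub fn set_field(&mut self, object: usize, index: usize, value: Slot) {
        self.active[object].fields[index] = value;
    }

    pub fn live_objects(&self) -> usize {
        self.active.len()
    }

    /// Copies everything reachable from `roots` into a fresh space and
    /// rewrites the roots and all references to the new indices.
    pub fn collect(&mut self, roots: &mut [usize]) {
        let mut to_space: Vec<Cell> = Vec::with_capacity(self.capacity);
        for root in roots.iter_mut() {
            *root = Self::copy(&mut self.active, &mut to_space, *root);
        }
        // Cheney scan: cells before `scan` are fully processed, cells after
        // it still contain references into the old space.
        let mut scan = 0;
        while scan < to_space.len() {
            for i in 0..to_space[scan].fields.len() {
                let slot = to_space[scan].fields[i];
                if let Slot::Ref(old) = slot {
                    let new = Self::copy(&mut self.active, &mut to_space, old);
                    to_space[scan].fields[i] = Slot::Ref(new);
                }
            }
            scan += 1;
        }
        self.active = to_space; // the old space is dropped wholesale
    }

    /// Copies one cell unless it was already moved, then returns its new index.
    fn copy(from: &mut [Cell], to: &mut Vec<Cell>, old: usize) -> usize {
        if let Some(new) = from[old].forwarded_to {
            return new;
        }
        let new = to.len();
        to.push(Cell { fields: from[old].fields.clone(), forwarded_to: None });
        from[old].forwarded_to = Some(new);
        new
    }
}
```

The forwarding index is what keeps shared and cyclic references correct: an object is copied only the first time it is reached, and every later reference is redirected to the same copy. Unreachable cells are never visited, so the cost of a collection depends on the amount of live data rather than on the heap size.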
Real‑world JVMs use more sophisticated generational collectors (e.g., G1, Parallel GC) that build on similar copying principles.
Conclusion
Building rjvm taught me a great deal about Rust, virtual machine design, bytecode interpretation, and low‑level memory management. While the project is now paused, the experience was rewarding and demonstrated that Rust is an excellent language for systems‑level programming.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
21CTO
