How I Built a Toy Java Virtual Machine in Rust – Lessons and Code

After learning Rust, I built a toy Java Virtual Machine called rjvm and open-sourced it on GitHub. This article walks through its architecture, class file parsing, bytecode execution, value modeling, instruction handling, exception processing, and simple stop-the-world garbage collector, with insights and code snippets along the way.

21CTO

Introduction

After spending considerable time learning Rust, I embarked on an ambitious project: writing a Java Virtual Machine (JVM) in Rust, named rjvm. The code is open-source on GitHub.

This is a toy JVM built for learning purposes, not a production-ready implementation. It deliberately omits several features, including generics, threads, reflection, annotations, I/O, just-in-time compilation, and string interning.

Implemented Core Features

Control‑flow statements (if, for, …)

Primitive and object creation

Virtual and static method invocation

Exception handling

Garbage collection

Parsing of classes from .class files in rt.jar

Test Suite Example

class StackTracePrinting { 
    public static void main(String[] args) { 
        Throwable ex = new Exception(); 
        StackTraceElement[] stackTrace = ex.getStackTrace(); 
        for (StackTraceElement element : stackTrace) { 
            tempPrint( 
                element.getClassName() + "::" + element.getMethodName() + " - " + 
                element.getFileName() + ":" + element.getLineNumber()); 
        } 
    } 
    // We use this in place of System.out.println because we don't have real I/O 
    private static native void tempPrint(String value); 
}

The test uses real classes from OpenJDK 7's rt.jar, so java.lang.StackTraceElement is genuine.

Code Organization

The project follows a standard Rust layout split into three crates (packages):

reader : reads .class files and models their contents.

vm : the virtual machine that can execute code as a library.

vm_cli : a simple command‑line launcher to run the VM on compiled Java files.

I am considering publishing reader separately on crates.io because it may be useful to other developers.

Parsing .class Files

Java source files are compiled by javac into .class files: binary files that contain a class's metadata and bytecode. (Archives such as rt.jar are ZIP files that bundle many .class files.) Loading a class involves reading its metadata (name, source file, superclass, interfaces, fields, methods) and the associated bytecode.

Metadata includes class name, source file name, superclass name, implemented interfaces, fields with types and annotations.

Method information includes descriptors, throws clauses, generic signatures, bytecode, exception tables, and line number tables.

The reader crate parses these structures into Rust data types (see class_file.rs).
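To make the parsing step concrete, here is a minimal sketch of reading the fixed-size prefix of a .class file: the 0xCAFEBABE magic number, the minor and major versions, and the constant pool count, all stored big-endian. The names ClassFileHeader, HeaderError, and parse_header are assumptions made for this illustration and are not taken from the reader crate.

```rust
/// Fixed-size prefix found at the start of every .class file.
#[derive(Debug, PartialEq, Eq)]
pub struct ClassFileHeader {
    pub minor_version: u16,
    pub major_version: u16,
    pub constant_pool_count: u16,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum HeaderError {
    UnexpectedEof,
    BadMagic(u32),
}

/// Reads a big-endian u16 at `offset`, failing if the input is too short.
fn read_u16(bytes: &[u8], offset: usize) -> Result<u16, HeaderError> {
    let slice = bytes
        .get(offset..offset + 2)
        .ok_or(HeaderError::UnexpectedEof)?;
    Ok(u16::from_be_bytes([slice[0], slice[1]]))
}

/// Reads a big-endian u32 at `offset`, failing if the input is too short.
fn read_u32(bytes: &[u8], offset: usize) -> Result<u32, HeaderError> {
    let slice = bytes
        .get(offset..offset + 4)
        .ok_or(HeaderError::UnexpectedEof)?;
    Ok(u32::from_be_bytes([slice[0], slice[1], slice[2], slice[3]]))
}

/// Validates the magic number and extracts the version and pool size.
pub fn parse_header(bytes: &[u8]) -> Result<ClassFileHeader, HeaderError> {
    let magic = read_u32(bytes, 0)?;
    if magic != 0xCAFE_BABE {
        return Err(HeaderError::BadMagic(magic));
    }
    Ok(ClassFileHeader {
        minor_version: read_u16(bytes, 4)?,
        major_version: read_u16(bytes, 6)?,
        constant_pool_count: read_u16(bytes, 8)?,
    })
}
```

Classes compiled for Java 7, such as those in the rt.jar used by the tests, carry major version 51. A full parser would continue from here with the constant pool, access flags, fields, methods, and attributes.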

Method Execution

The VM’s main API is Vm::invoke, which executes a method. Execution requires a CallStack containing CallFrame objects—one per active method. Each frame holds a local variable array and an operand stack. New frames are pushed on method calls and popped when methods return.

Most methods are implemented in Java bytecode, but the VM also supports native methods (e.g., System::currentTimeMillis, System::arraycopy, Throwable::fillInStackTrace) implemented in Rust.
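The relationship between the call stack and its frames can be sketched roughly as follows. This is a simplified illustration, not rjvm's actual code: the Value enum here is a cut-down stand-in that only knows integers, and the field and method names (locals, operand_stack, push_frame, and so on) are assumptions.

```rust
/// Cut-down stand-in for the VM's value type (integers only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Value {
    Uninitialized,
    Int(i32),
}

/// One activation of a method: local variables plus an operand stack.
#[derive(Debug)]
pub struct CallFrame {
    pub method_name: String,
    pub locals: Vec<Value>,
    pub operand_stack: Vec<Value>,
    max_stack: usize,
}

impl CallFrame {
    /// Creates a frame whose first local slots hold the call arguments.
    pub fn new(method_name: &str, max_locals: usize, max_stack: usize, args: &[Value]) -> Self {
        let mut locals = vec![Value::Uninitialized; max_locals];
        for (slot, arg) in locals.iter_mut().zip(args) {
            *slot = *arg;
        }
        CallFrame {
            method_name: method_name.to_string(),
            locals,
            operand_stack: Vec::with_capacity(max_stack),
            max_stack,
        }
    }

    /// Pushes a value, refusing to grow past the declared maximum depth.
    pub fn push(&mut self, value: Value) -> Result<(), String> {
        if self.operand_stack.len() == self.max_stack {
            return Err(format!("operand stack overflow in {}", self.method_name));
        }
        self.operand_stack.push(value);
        Ok(())
    }

    /// Pops the top value, reporting underflow as an error.
    pub fn pop(&mut self) -> Result<Value, String> {
        self.operand_stack
            .pop()
            .ok_or_else(|| format!("operand stack underflow in {}", self.method_name))
    }
}

/// The stack of active method invocations.
#[derive(Default)]
pub struct CallStack {
    frames: Vec<CallFrame>,
}

impl CallStack {
    /// Called on method invocation; returns the newly active frame.
    pub fn push_frame(&mut self, frame: CallFrame) -> &mut CallFrame {
        self.frames.push(frame);
        self.frames.last_mut().expect("a frame was just pushed")
    }

    /// Called when a method returns or unwinds.
    pub fn pop_frame(&mut self) -> Option<CallFrame> {
        self.frames.pop()
    }

    pub fn depth(&self) -> usize {
        self.frames.len()
    }
}
```

Sizing the locals and the operand stack up front mirrors the class file format, where each method's Code attribute declares max_locals and max_stack.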

Value and Object Modeling

/// Models a generic value that can be stored in a local variable or on the stack.
#[derive(Debug, Default, Clone, PartialEq)]
pub enum Value<'a> {
    /// An uninitialized element (should never appear on the stack).
    #[default]
    Uninitialized,
    /// 32‑bit integral types: boolean, byte, char, short, int.
    Int(i32),
    /// 64‑bit long.
    Long(i64),
    /// 32‑bit float.
    Float(f32),
    /// 64‑bit double.
    Double(f64),
    /// Object reference.
    Object(AbstractObject<'a>),
    /// Null reference.
    Null,
}

Objects were initially represented by a simple Object struct holding a class reference and a Vec<Value> for the fields. With the garbage collector in place, this has been replaced by a lower-level representation that uses raw pointers and manual casting.

Instruction Modeling

/// Represents a Java bytecode instruction.
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Instruction {
    Aaload,
    Aastore,
    Aconst_null,
    Aload(u8),
    // ... (many more instructions, over 200 total)
}

Each instruction may have operands; the VM fetches, decodes, and executes them, updating the program counter accordingly.
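As an illustration of the decode step, the sketch below turns raw bytes into instructions and reports where the next instruction starts. The opcode values are the standard ones from the JVM specification; the decode function, its signature, and the extra Goto variant are assumptions made for this example, and only five opcodes are covered.

```rust
/// A small subset of the instruction enum, enough to show decoding.
#[allow(non_camel_case_types)]
#[derive(Clone, Copy, Debug, Eq, PartialEq)]
pub enum Instruction {
    Aaload,
    Aastore,
    Aconst_null,
    Aload(u8),
    Goto(i16),
}

/// Decodes the instruction at `pc` and returns it with the next pc.
pub fn decode(code: &[u8], pc: usize) -> Result<(Instruction, usize), String> {
    let opcode = *code.get(pc).ok_or("pc is past the end of the code")?;

    // Fetches the operand byte located `offset` bytes after the opcode.
    let operand = |offset: usize| -> Result<u8, String> {
        code.get(pc + offset)
            .copied()
            .ok_or_else(|| format!("truncated operand for opcode {:#04x}", opcode))
    };

    match opcode {
        0x01 => Ok((Instruction::Aconst_null, pc + 1)),
        // aload takes a one-byte local variable index.
        0x19 => Ok((Instruction::Aload(operand(1)?), pc + 2)),
        0x32 => Ok((Instruction::Aaload, pc + 1)),
        0x53 => Ok((Instruction::Aastore, pc + 1)),
        // goto takes a signed, big-endian 16-bit branch offset.
        0xa7 => {
            let offset = i16::from_be_bytes([operand(1)?, operand(2)?]);
            Ok((Instruction::Goto(offset), pc + 3))
        }
        other => Err(format!("unsupported opcode {:#04x}", other)),
    }
}
```

Returning the next program counter alongside the instruction keeps variable-length encodings out of the execution code: the interpreter only has to store the returned value, or overwrite it when a branch is taken.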

Instruction Execution

fn execute_instruction(
    &mut self,
    vm: &mut Vm<'a>,
    call_stack: &mut CallStack<'a>,
    instruction: Instruction,
) -> Result<InstructionCompleted<'a>, MethodCallFailed<'a>> {
    // implementation omitted for brevity
}

The execution result can be one of:

Successful execution, continue the method.

Return instruction, unwind with an optional return value.

Internal VM error.

Java exception thrown.
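The four outcomes listed above can be sketched with a deliberately tiny interpreter. Everything here is simplified for illustration: the instruction set (Push, Add, Div, Return) is hypothetical, values are plain i32, and errors and exceptions are carried as strings rather than as the VM's real types.

```rust
/// Hypothetical four-instruction bytecode, just enough for the example.
#[derive(Clone, Copy, Debug)]
pub enum Instruction {
    Push(i32),
    Add,
    Div,
    Return,
}

/// Successful outcomes of executing one instruction.
#[derive(Debug, PartialEq)]
pub enum InstructionCompleted {
    ReturnFromMethod(Option<i32>),
    ContinueMethodExecution,
}

/// Failure outcomes: a bug in the VM or bytecode, or a Java exception.
#[derive(Debug, PartialEq)]
pub enum MethodCallFailed {
    InternalError(String),
    ExceptionThrown(String),
}

fn pop(stack: &mut Vec<i32>) -> Result<i32, MethodCallFailed> {
    stack
        .pop()
        .ok_or_else(|| MethodCallFailed::InternalError("operand stack underflow".to_string()))
}

pub fn execute_instruction(
    stack: &mut Vec<i32>,
    instruction: Instruction,
) -> Result<InstructionCompleted, MethodCallFailed> {
    match instruction {
        Instruction::Push(value) => stack.push(value),
        Instruction::Add => {
            // The right-hand operand is on top of the stack.
            let (b, a) = (pop(stack)?, pop(stack)?);
            stack.push(a.wrapping_add(b));
        }
        Instruction::Div => {
            let (b, a) = (pop(stack)?, pop(stack)?);
            if b == 0 {
                return Err(MethodCallFailed::ExceptionThrown(
                    "java.lang.ArithmeticException".to_string(),
                ));
            }
            stack.push(a.wrapping_div(b));
        }
        Instruction::Return => {
            return Ok(InstructionCompleted::ReturnFromMethod(stack.pop()));
        }
    }
    Ok(InstructionCompleted::ContinueMethodExecution)
}

/// Runs instructions in order until one returns or fails.
pub fn run(code: &[Instruction]) -> Result<Option<i32>, MethodCallFailed> {
    let mut stack = Vec::new();
    for &instruction in code {
        if let InstructionCompleted::ReturnFromMethod(value) =
            execute_instruction(&mut stack, instruction)?
        {
            return Ok(value);
        }
    }
    Err(MethodCallFailed::InternalError(
        "fell off the end of the method".to_string(),
    ))
}
```

Splitting the result into a success enum and a failure enum lets the loop use the ? operator: internal errors and thrown exceptions both leave the loop early, and the caller decides whether a handler exists.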

Exception Handling

Each catch block corresponds to an entry in the method’s exception table, specifying the PC range, handler address, and exception class. When an exception is thrown, the VM searches for a matching handler; if none is found, the exception propagates up the call stack.

The outcome of executing an instruction, including a thrown exception, is modeled by two types:

enum InstructionCompleted<'a> {
    ReturnFromMethod(Option<Value<'a>>),
    ContinueMethodExecution,
}

enum MethodCallFailed<'a> {
    InternalError(VmError),
    ExceptionThrown(JavaException<'a>),
}
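The handler search described in this section might look roughly like the sketch below. The entry layout follows the class file format (start_pc is inclusive, end_pc is exclusive, and an entry without a catch type matches any exception, which is how finally blocks are compiled). The names ExceptionTableEntry, find_handler, and is_subclass are assumptions, and is_subclass is a toy stand-in for a real class hierarchy lookup.

```rust
/// One row of a method's exception table.
pub struct ExceptionTableEntry {
    /// First covered program counter (inclusive).
    pub start_pc: u16,
    /// End of the covered range (exclusive).
    pub end_pc: u16,
    /// Where execution resumes if this entry matches.
    pub handler_pc: u16,
    /// Class to catch; `None` catches everything.
    pub catch_class: Option<&'static str>,
}

/// Toy stand-in for a class hierarchy check: a class matches itself,
/// and everything is assumed to extend java/lang/Throwable.
pub fn is_subclass(thrown: &str, catch: &str) -> bool {
    thrown == catch || catch == "java/lang/Throwable"
}

/// Returns the handler address of the first matching entry, if any.
/// Entries are tried in table order, so earlier entries take priority.
pub fn find_handler(table: &[ExceptionTableEntry], pc: u16, thrown_class: &str) -> Option<u16> {
    table
        .iter()
        .find(|entry| {
            let in_range = entry.start_pc <= pc && pc < entry.end_pc;
            let type_matches = match entry.catch_class {
                None => true,
                Some(catch) => is_subclass(thrown_class, catch),
            };
            in_range && type_matches
        })
        .map(|entry| entry.handler_pc)
}
```

When find_handler returns None, the current frame is popped and the same search is repeated in the caller's frame, using the program counter of the call site.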

Garbage Collection

The final milestone of rjvm is a simple stop-the-world garbage collector based on Cheney's copying algorithm. Memory is split into two semi-spaces. Allocation happens in the active space; when it fills up, the collector copies the live objects to the other space, updates all references to them, and swaps the roles of the two spaces. Only half of the heap is usable at any time, but in exchange allocation is a simple pointer bump, and copying compacts the surviving objects, which eliminates fragmentation.

Real‑world JVMs use more sophisticated generational collectors (e.g., G1, Parallel GC) that build on similar copying principles.
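The semi-space scheme can be illustrated with a small, safe-Rust model of Cheney's algorithm. This is a sketch under simplifying assumptions, not rjvm's collector: the heap is a Vec of machine words rather than raw memory, every object is a header word holding its field count followed by that many reference fields, and all names (Heap, alloc, collect, NULL, FORWARDED) are invented for this example.

```rust
/// Marker for a null reference.
pub const NULL: usize = usize::MAX;
/// High bit of a header: set once the object has been moved.
const FORWARDED: usize = 1 << (usize::BITS - 1);

pub struct Heap {
    from_space: Vec<usize>, // active space, bump-allocated
    to_space: Vec<usize>,   // idle space, filled during collection
    capacity: usize,        // words per semi-space
}

impl Heap {
    pub fn new(capacity: usize) -> Self {
        Heap {
            from_space: Vec::with_capacity(capacity),
            to_space: Vec::with_capacity(capacity),
            capacity,
        }
    }

    /// Bump allocation: returns the new object's address, or None if
    /// the active space is full and a collection is needed.
    pub fn alloc(&mut self, num_fields: usize) -> Option<usize> {
        if self.from_space.len() + 1 + num_fields > self.capacity {
            return None;
        }
        let addr = self.from_space.len();
        self.from_space.push(num_fields);
        self.from_space.extend(std::iter::repeat(NULL).take(num_fields));
        Some(addr)
    }

    pub fn set_field(&mut self, obj: usize, index: usize, target: usize) {
        self.from_space[obj + 1 + index] = target;
    }

    pub fn field(&self, obj: usize, index: usize) -> usize {
        self.from_space[obj + 1 + index]
    }

    /// Number of words in use in the active space.
    pub fn used(&self) -> usize {
        self.from_space.len()
    }

    /// Moves one object to to-space, or returns where it already went.
    fn copy(&mut self, addr: usize) -> usize {
        if addr == NULL {
            return NULL;
        }
        let header = self.from_space[addr];
        if header & FORWARDED != 0 {
            return header & !FORWARDED;
        }
        let new_addr = self.to_space.len();
        let end = addr + 1 + header;
        self.to_space.extend_from_slice(&self.from_space[addr..end]);
        // Leave a forwarding address behind for other references.
        self.from_space[addr] = FORWARDED | new_addr;
        new_addr
    }

    /// Copies everything reachable from `roots` and swaps the spaces.
    pub fn collect(&mut self, roots: &mut [usize]) {
        self.to_space.clear();
        for root in roots.iter_mut() {
            *root = self.copy(*root);
        }
        // Cheney scan: to-space itself serves as the work queue.
        let mut scan = 0;
        while scan < self.to_space.len() {
            let num_fields = self.to_space[scan];
            for i in 1..=num_fields {
                let old = self.to_space[scan + i];
                let new = self.copy(old);
                self.to_space[scan + i] = new;
            }
            scan += 1 + num_fields;
        }
        std::mem::swap(&mut self.from_space, &mut self.to_space);
    }
}
```

Because the scan pointer chases the allocation pointer through to-space, no separate mark stack or recursion is needed, and the forwarding addresses make cycles and shared objects safe to copy. In a real VM the roots would come from the call frames' local variables and operand stacks plus static fields.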

Conclusion

Building rjvm taught me a great deal about Rust, virtual machine design, bytecode interpretation, and low‑level memory management. While the project is now paused, the experience was rewarding and demonstrated that Rust is an excellent language for systems‑level programming.
