
How to Build a WebAssembly Interpreter: From Binary Decoding to Stack‑Based Execution

This article walks through the design and implementation of a WebAssembly interpreter, covering Wasm fundamentals, binary module structure, decoding into an in‑memory representation, stack‑based virtual machine execution, call‑stack management, and concrete code examples from the open‑source project.

NetEase Cloud Music Tech Team

Background

The author implemented a minimal WebAssembly (Wasm) interpreter to study Wasm internals. The source code is hosted at https://github.com/mcuking/wasmc.

Wasm Basics

Wasm is a low‑level, stack‑based binary instruction format that serves as a portable compilation target for languages such as C, C++, and Rust. It can run in browsers and in non‑browser environments, offering near‑native performance while remaining platform‑independent.

A module takes four forms over its lifetime:

Binary format (.wasm) is the primary encoding.

Text format (.wat) provides a human‑readable assembly‑like syntax.

In‑memory format represents the module after decoding, typically as C structs.

Module instance is the runtime object created from the in‑memory format.

Wasm Module Structure

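A Wasm binary starts with a 4‑byte magic number (the bytes 0x00 0x61 0x73 0x6D, i.e. "\0asm") and a 4‑byte version field (currently 1), followed by a sequence of sections. Each section begins with a one‑byte section ID; Wasm 1.0 defines twelve IDs, 0 through 11, in this order: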

Custom

Type

Import

Function

Table

Memory

Global

Export

Start

Element

Code

Data

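Every section except the custom section may appear at most once, and the non‑custom sections must appear in the order listed above. Custom sections may appear any number of times, at any position. The fixed ordering lets an engine process a module in a single pass, which enables streaming compilation.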
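As a rough illustration of the layout rules above, the following sketch checks the 8‑byte header and the ordering of section IDs. The function name check_module_layout and its error codes are hypothetical and are not taken from wasmc. To stay short, it assumes every section size fits in a single LEB128 byte (sections under 128 bytes), and it does not look inside the sections.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of wasmc.
 * Returns 0 if buf starts with the Wasm magic number and version 1 and
 * every non-custom section ID is in range and strictly increasing.
 * Assumption: each section size is a single LEB128 byte (< 128). */
int check_module_layout(const uint8_t *buf, size_t len) {
    static const uint8_t header[8] = {0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00};
    if (len < 8 || memcmp(buf, header, 8) != 0) return -1;

    size_t pos = 8;
    int last_id = 0;                        /* highest non-custom ID seen so far */
    while (pos < len) {
        if (len - pos < 2) return -2;       /* need an ID byte and a size byte */
        uint8_t id = buf[pos++];
        uint8_t size = buf[pos++];
        if (size & 0x80) return -3;         /* multi-byte size: not handled in this sketch */
        if (id > 11) return -4;             /* unknown section ID */
        if (id != 0) {                      /* custom sections (ID 0) may appear anywhere */
            if (id <= last_id) return -5;   /* duplicate or out-of-order section */
            last_id = id;
        }
        if (size > len - pos) return -6;    /* section runs past the end of the buffer */
        pos += size;                        /* skip the section payload */
    }
    return 0;
}
```

A real decoder would read the size with a full LEB128 routine and dispatch on the ID to parse each payload instead of skipping it.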

Decoder Stage

The decoder reads the binary, validates section ordering, and populates an internal module structure. Key operations include:

Parsing each section and filling fields such as type signatures, function indices, and code bodies.

Reading variable‑length integers encoded with LEB128. The helper read_LEB in utils.c implements both unsigned and signed variants.

Relevant source files:

module.c: function load_module performs the full decoding.

utils.c: function read_LEB handles LEB128 decoding.

Execution Stage

The interpreter executes the in‑memory module using a stack‑based virtual machine (VM). The VM follows the classic fetch‑decode‑execute cycle for each opcode.

Virtual Machine Concepts

The VM simulates a CPU. Wasm uses a single operand stack; most instructions pop their operands from this stack and push results back onto it.

Instruction Set

Wasm defines 178 opcodes grouped into five categories: control, parametric, variable, memory, and numeric. Each opcode is a single byte optionally followed by immediate operands.

Operand Stack and Immediate Values

Example: f32.sub pops two 32‑bit floats, computes the difference, and pushes the result. Immediate values such as i32.const 3 are encoded directly after the opcode.
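The behaviour described above can be sketched with a small operand stack. The types and names here (Value, OperandStack, exec_f32_sub, exec_i32_const) are illustrative assumptions rather than wasmc's actual definitions, and bounds checks are omitted to keep the example short.

```c
#include <stdint.h>

#define STACK_MAX 64

/* One stack slot; a real interpreter would also cover i64 and f64. */
typedef union {
    int32_t i32;
    float   f32;
} Value;

typedef struct {
    Value slots[STACK_MAX];
    int   sp;               /* number of values currently on the stack */
} OperandStack;

void  push_i32(OperandStack *s, int32_t v) { s->slots[s->sp++].i32 = v; }
void  push_f32(OperandStack *s, float v)   { s->slots[s->sp++].f32 = v; }
int32_t pop_i32(OperandStack *s)           { return s->slots[--s->sp].i32; }
float   pop_f32(OperandStack *s)           { return s->slots[--s->sp].f32; }

/* f32.sub: both operands come from the stack, the result goes back on it. */
void exec_f32_sub(OperandStack *s) {
    float b = pop_f32(s);   /* right-hand operand was pushed last */
    float a = pop_f32(s);
    push_f32(s, a - b);
}

/* i32.const: the operand is an immediate decoded from the bytes that
 * follow the opcode, so nothing is popped. */
void exec_i32_const(OperandStack *s, int32_t immediate) {
    push_i32(s, immediate);
}
```

Pushing 5.5 and then 2.0 before calling exec_f32_sub leaves a single value, 3.5, on the stack.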

Call Stack and Stack Frames

Function calls are managed by a call stack composed of stack frames. Each frame stores:

A reference to the function’s metadata.

A pointer into the shared operand stack that marks the frame’s stack‑base.

The return address (program counter) for the caller.

All frames share a single operand stack, allowing parameters to be passed without copying.
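One way to picture frames that share an operand stack is the sketch below. It is a minimal model under stated assumptions: FuncInfo, Frame, VM, enter_function, and leave_function are hypothetical names, the frame base is stored as an index rather than a pointer, and every value is held in a 64‑bit slot.

```c
#include <stdint.h>

#define STACK_MAX 256
#define FRAME_MAX 32

typedef struct {            /* hypothetical function metadata */
    int      param_count;
    int      local_count;   /* locals declared in addition to the parameters */
    uint32_t code_start;    /* offset of the function body */
} FuncInfo;

typedef struct {
    const FuncInfo *func;   /* reference to the function's metadata */
    int      fp;            /* stack index of parameter 0: the frame's base */
    uint32_t return_pc;     /* where the caller resumes */
} Frame;

typedef struct {
    int64_t  stack[STACK_MAX];  /* operand stack shared by all frames */
    int      sp;                /* number of values on the stack */
    Frame    frames[FRAME_MAX];
    int      frame_count;
    uint32_t pc;
} VM;

/* Enter a function: the arguments already on the stack become locals 0..n-1. */
int enter_function(VM *vm, const FuncInfo *f) {
    if (vm->frame_count >= FRAME_MAX || vm->sp < f->param_count) return -1;
    if (vm->sp + f->local_count > STACK_MAX) return -1;
    Frame *fr = &vm->frames[vm->frame_count++];
    fr->func = f;
    fr->fp = vm->sp - f->param_count;    /* no copying: just remember the base */
    fr->return_pc = vm->pc;
    for (int i = 0; i < f->local_count; i++) vm->stack[vm->sp++] = 0;
    vm->pc = f->code_start;
    return 0;
}

/* Leave a function: slide the results down over the parameters and locals. */
int leave_function(VM *vm, int result_count) {
    if (vm->frame_count == 0) return -1;
    Frame *fr = &vm->frames[vm->frame_count - 1];
    if (result_count < 0 || vm->sp - fr->fp < result_count) return -1;
    for (int i = 0; i < result_count; i++)
        vm->stack[fr->fp + i] = vm->stack[vm->sp - result_count + i];
    vm->sp = fr->fp + result_count;
    vm->pc = fr->return_pc;
    vm->frame_count--;
    return 0;
}
```

With 13 and 42 on the stack, entering a two‑parameter function sets fp to 0; after the body pushes 55 and leave_function(vm, 1) runs, the caller sees a stack holding only 55.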

Example Wasm Module

(module
    (func $compute (result i32)
        i32.const 13          ;; push 13
        f32.const 42.0        ;; push 42.0
        call $add             ;; result 55
        f32.const 10.0        ;; push 10.0
        call $add             ;; result 65
    )
    (func $add (param $a i32) (param $b f32) (result i32)
        local.get $a         ;; push parameter $a (i32)
        local.get $b         ;; push parameter $b (f32)
        i32.trunc_f32_s      ;; truncate float to i32
        i32.add              ;; add two i32 values
    )
    (export "compute" (func $compute))
    (export "add" (func $add))
)

The interpreter’s main loop in interpreter.c uses a while loop and a switch statement to fetch, decode, and execute each opcode.

Implementation Details

LEB128 Encoding

Lengths, indices, and other integers are encoded with LEB128 (Little‑Endian Base‑128). It stores 7 bits per byte; the most‑significant bit indicates continuation. This variable‑length encoding reduces binary size for small values. The project’s read_LEB function handles both unsigned and signed variants.
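To make the encoding concrete, here is a minimal decoder sketch for 32‑bit values. It is not the project's read_LEB; the names leb_read_u32 and leb_read_i32, their signatures, and the error convention are assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an unsigned LEB128 value of at most 32 bits starting at *pos.
 * Advances *pos past the bytes consumed. Returns 0 on success, -1 on error. */
int leb_read_u32(const uint8_t *buf, size_t len, size_t *pos, uint32_t *out) {
    uint32_t result = 0;
    for (unsigned shift = 0; shift < 35; shift += 7) {
        if (*pos >= len) return -1;                  /* ran out of input */
        uint8_t byte = buf[(*pos)++];
        result |= (uint32_t)(byte & 0x7f) << shift;  /* low 7 bits carry data */
        if ((byte & 0x80) == 0) {                    /* high bit clear: last byte */
            *out = result;
            return 0;
        }
    }
    return -1;                                       /* longer than 5 bytes */
}

/* Signed variant: same loop, then sign-extend from bit 6 of the last byte. */
int leb_read_i32(const uint8_t *buf, size_t len, size_t *pos, int32_t *out) {
    uint32_t result = 0;
    unsigned shift = 0;
    uint8_t byte;
    do {
        if (*pos >= len || shift >= 35) return -1;
        byte = buf[(*pos)++];
        result |= (uint32_t)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    if (shift < 32 && (byte & 0x40))
        result |= ~(uint32_t)0 << shift;             /* fill the upper bits with 1s */
    *out = (int32_t)result;
    return 0;
}
```

For example, the bytes E5 8E 26 decode to the unsigned value 624485, and C0 BB 78 decode to the signed value -123456.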

Module Data Structure

After decoding, all module information is stored in a single C struct named module. The struct contains arrays for type signatures, function bodies, import/export tables, memory limits, table entries, globals, and data segments. The layout mirrors the order of sections in the binary.

Validation Integrated with Decoding/Execution

The interpreter does not have a separate validation pass. Instead, it performs checks at the point where the relevant data is processed:

During decoding it verifies that section IDs are legal and appear in the correct order.

During execution it checks that function call signatures match (parameter count, types, and return arity).

When instantiating a module it allocates memory and table space, records function entry points, and builds the unified module object.
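The signature check mentioned in the list above can be sketched as a comparison of two function types, for instance when an indirect call has to compare the type it expects with the type of the function it actually reaches. FuncType, MAX_TYPES, and signatures_match are hypothetical names chosen for this example; the value‑type byte codes are the ones used in the binary format.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_TYPES 8   /* arbitrary limit chosen for this sketch */

/* Value-type encodings from the Wasm binary format. */
enum { VAL_I32 = 0x7f, VAL_I64 = 0x7e, VAL_F32 = 0x7d, VAL_F64 = 0x7c };

typedef struct {
    uint32_t param_count;
    uint8_t  params[MAX_TYPES];
    uint32_t result_count;
    uint8_t  results[MAX_TYPES];
} FuncType;

/* Two signatures match when parameter count, parameter types,
 * result count, and result types are all identical. */
bool signatures_match(const FuncType *expected, const FuncType *actual) {
    if (expected->param_count != actual->param_count) return false;
    if (expected->result_count != actual->result_count) return false;
    if (expected->param_count > MAX_TYPES || expected->result_count > MAX_TYPES)
        return false;
    return memcmp(expected->params, actual->params, expected->param_count) == 0
        && memcmp(expected->results, actual->results, expected->result_count) == 0;
}
```

When the check fails, an interpreter would normally trap instead of executing the call.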

Interpreter Loop

The core loop looks roughly like:

while (pc < code_end) {
    uint8_t opcode = *pc++;
    switch (opcode) {
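        case OP_I32_CONST: {
            // The immediate is a signed LEB128 integer that follows the opcode.
            int32_t value = read_leb_i32(&pc);
            push_i32(value);
            break;
        }
        case OP_F32_SUB: {
            // The right-hand operand was pushed last, so it is popped first.
            float b = pop_f32();
            float a = pop_f32();
            push_f32(a - b);
            break;
        }
        case OP_CALL: {
            // The immediate is the callee's index in the function index space.
            uint32_t func_idx = read_leb_u32(&pc);
            call_function(func_idx);
            break;
        }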
        // ... other opcodes ...
    }
}

Each case implements the semantics defined by the Wasm specification. The call_function routine creates a new stack frame whose base marks the arguments already sitting on the shared operand stack, records the caller’s return address, and jumps to the target function’s code offset.
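For readers who want something they can compile, here is a self‑contained miniature of such a loop. It is a simplified sketch, not wasmc's interpreter: run_body is a hypothetical function that handles only i32.const, i32.add, i32.sub, and end, and it assumes each i32.const immediate fits in one signed LEB128 byte (values from -64 to 63).

```c
#include <stddef.h>
#include <stdint.h>

/* Opcode bytes from the Wasm binary format. */
enum { OP_END = 0x0b, OP_I32_CONST = 0x41, OP_I32_ADD = 0x6a, OP_I32_SUB = 0x6b };

/* Executes a straight-line function body.
 * Returns 0 and stores the top of the stack in *result when `end` is reached,
 * or -1 on malformed input, stack underflow/overflow, or an unknown opcode. */
int run_body(const uint8_t *code, size_t len, int32_t *result) {
    int32_t stack[32];
    int sp = 0;                 /* number of values on the operand stack */
    size_t pc = 0;

    while (pc < len) {
        uint8_t opcode = code[pc++];                /* fetch */
        switch (opcode) {                           /* decode */
        case OP_I32_CONST: {
            /* Assumption: one-byte signed LEB128 immediate. */
            if (pc >= len || (code[pc] & 0x80) || sp >= 32) return -1;
            uint8_t byte = code[pc++];
            stack[sp++] = (byte & 0x40) ? (int32_t)byte - 128 : (int32_t)byte;
            break;
        }
        case OP_I32_ADD:
        case OP_I32_SUB: {
            if (sp < 2) return -1;
            /* Unsigned arithmetic gives the wrap-around behaviour Wasm requires. */
            uint32_t b = (uint32_t)stack[--sp];
            uint32_t a = (uint32_t)stack[--sp];
            stack[sp++] = (int32_t)(opcode == OP_I32_ADD ? a + b : a - b);
            break;
        }
        case OP_END:
            if (sp < 1) return -1;
            *result = stack[sp - 1];
            return 0;
        default:
            return -1;                              /* opcode not implemented here */
        }
    }
    return -1;                                      /* body ended without `end` */
}
```

Feeding it the bytes 41 0D 41 2A 6A 0B (i32.const 13, i32.const 42, i32.add, end) yields 55.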

Key Repository References

Project repository: https://github.com/mcuking/wasmc

Utility functions (LEB128): https://github.com/mcuking/wasmc/blob/master/source/utils.c

Module loader: https://github.com/mcuking/wasmc/blob/master/source/module.c

Opcode definitions: https://github.com/mcuking/wasmc/blob/master/source/opcode.h

Interpreter implementation: https://github.com/mcuking/wasmc/blob/master/source/interpreter.c

Conclusion

The interpreter demonstrates the core Wasm pipeline: binary decoding, on‑the‑fly validation, and stack‑based execution without JIT optimizations. Its straightforward design and extensive comments make it a useful reference for anyone learning Wasm internals.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
