Build Your Own LC‑3 Virtual Machine in C – A Complete Step‑by‑Step Guide
This tutorial walks you through creating a small LC‑3 virtual machine in C, covering the underlying architecture, memory and register modeling, instruction decoding, implementation of key opcodes such as ADD and LDI, trap routines, program loading, and platform‑specific details, all with full source code examples.
1. Introduction
This article explains how to write your own virtual machine (VM) capable of executing programs written in LC‑3 assembly, such as a simple 2048 clone or a roguelike game. It is aimed at programmers who want to understand how computers work internally and how programming languages are implemented.
2. What Is a Virtual Machine?
A VM simulates core hardware components (CPU, memory, I/O) and can execute machine language directly. It can be used to model specific hardware (e.g., game consoles) or to provide a portable execution environment for software.
3. LC‑3 Architecture
LC‑3 is a teaching architecture with a 16‑bit word size, 65,536 memory locations, and 10 registers (8 general‑purpose, PC, and condition flags). The VM models memory as a simple array:
/* 65,536 locations (2^16); UINT16_MAX would give only 65,535 */
uint16_t memory[1 << 16];
Registers are stored in an array indexed by an enum:
enum {
    R_R0 = 0,
    R_R1,
    R_R2,
    R_R3,
    R_R4,
    R_R5,
    R_R6,
    R_R7,
    R_PC,   /* program counter */
    R_COND, /* condition flags */
    R_COUNT
};
uint16_t reg[R_COUNT];
4. Instruction Set Overview
Each 16‑bit instruction consists of an opcode (the high 4 bits) and operands (the remaining 12 bits). LC‑3 defines 16 opcodes (ADD, AND, NOT, BR, JMP, JSR, LD, LDI, and others). Fields are extracted with bit shifts and masks; the opcode, for instance, is instr >> 12.
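As a rough illustration of shift-and-mask decoding, the sketch below pulls apart the fields of an ADD/AND-style instruction. The names decoded_t and decode_alu are hypothetical and introduced only for this example; sign_extend is a copy of the helper the article uses for ADD.

```c
#include <stdint.h>

/* Hypothetical container for the fields of an ADD/AND-style instruction. */
typedef struct {
    uint16_t opcode;   /* bits 15..12 */
    uint16_t dr;       /* bits 11..9: destination register */
    uint16_t sr1;      /* bits 8..6: first source register */
    uint16_t imm_flag; /* bit 5: 1 = immediate mode, 0 = register mode */
    uint16_t operand;  /* sign-extended imm5, or the SR2 register number */
} decoded_t;

/* Widen a bit_count-bit two's-complement value to 16 bits. */
uint16_t sign_extend(uint16_t x, int bit_count) {
    if ((x >> (bit_count - 1)) & 1) {
        x |= (uint16_t)(0xFFFFu << bit_count);
    }
    return x;
}

/* Extract every field with one shift and one mask each. */
decoded_t decode_alu(uint16_t instr) {
    decoded_t d;
    d.opcode   = instr >> 12;
    d.dr       = (instr >> 9) & 0x7;
    d.sr1      = (instr >> 6) & 0x7;
    d.imm_flag = (instr >> 5) & 0x1;
    d.operand  = d.imm_flag ? sign_extend(instr & 0x1F, 5)
                            : (uint16_t)(instr & 0x7);
    return d;
}
```

For example, the word 0x14FF encodes ADD R2, R3, #-1: the 5-bit immediate 0x1F has its top bit set, so it is widened to 0xFFFF before being used.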
5. Example: ADD Instruction
ADD adds two values and stores the result in a destination register. It supports register mode and immediate mode (5‑bit signed immediate). The implementation uses a helper sign_extend function:
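uint16_t sign_extend(uint16_t x, int bit_count) {
    /* If the field's top bit is set, the value is negative:
       fill all higher bits with 1s to keep it negative in 16 bits. */
    if ((x >> (bit_count - 1)) & 1) {
        x |= (0xFFFF << bit_count);
    }
    return x;
}
ADD implementation: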
{
    uint16_t r0 = (instr >> 9) & 0x7; /* DR */
    uint16_t r1 = (instr >> 6) & 0x7; /* SR1 */
    uint16_t imm_flag = (instr >> 5) & 0x1;
    if (imm_flag) {
        uint16_t imm5 = sign_extend(instr & 0x1F, 5);
        reg[r0] = reg[r1] + imm5;
    } else {
        uint16_t r2 = instr & 0x7; /* SR2 */
        reg[r0] = reg[r1] + reg[r2];
    }
    update_flags(r0);
}
6. Example: LDI (Load Indirect)
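LDI loads a value indirectly. The sign‑extended 9‑bit offset is added to the (already incremented) PC to locate a memory word; that word is treated as an address, and the value stored at that address is loaded into the destination register: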
{
    uint16_t r0 = (instr >> 9) & 0x7; /* DR */
    uint16_t pc_offset = sign_extend(instr & 0x1FF, 9);
    reg[r0] = mem_read(mem_read(reg[R_PC] + pc_offset));
    update_flags(r0);
}
7. Common Helper Functions
Updating condition flags after each register write:
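/* Condition flags: exactly one is set at any time. */
enum {
    FL_POS = 1 << 0, /* P */
    FL_ZRO = 1 << 1, /* Z */
    FL_NEG = 1 << 2  /* N */
};
void update_flags(uint16_t r) {
    if (reg[r] == 0) {
        reg[R_COND] = FL_ZRO;
    } else if (reg[r] >> 15) { /* a 1 in the top bit means negative */
        reg[R_COND] = FL_NEG;
    } else {
        reg[R_COND] = FL_POS;
    }
}
Memory access functions (with memory‑mapped I/O handling):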
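enum {
    MR_KBSR = 0xFE00, /* keyboard status register */
    MR_KBDR = 0xFE02  /* keyboard data register */
};
void mem_write(uint16_t address, uint16_t val) {
    memory[address] = val;
}
uint16_t mem_read(uint16_t address) {
    /* Reading the status register polls the keyboard first. */
    if (address == MR_KBSR) {
        if (check_key()) {
            memory[MR_KBSR] = 1 << 15; /* key ready */
            memory[MR_KBDR] = getchar();
        } else {
            memory[MR_KBSR] = 0;
        }
    }
    return memory[address];
}
8. Trap Routines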
LC‑3 defines a small set of system calls (TRAP routines) for I/O. The low 8 bits of the instruction select the routine (trap vectors 0x20 through 0x25), so the handler is a switch on that field:
switch (instr & 0xFF) {
    case TRAP_GETC:  /* 0x20: get character, no echo */
        break;
    case TRAP_OUT:   /* 0x21: output character */
        break;
    case TRAP_PUTS:  /* 0x22: output null-terminated string */
        break;
    case TRAP_IN:    /* 0x23: get character with echo */
        break;
    case TRAP_PUTSP: /* 0x24: output packed bytes */
        break;
    case TRAP_HALT:  /* 0x25: halt program */
        running = 0;
        break;
}
Example implementation of PUTS (output a string stored one character per 16‑bit word):
{
    /* R0 holds the address of the first character; a zero word ends the string. */
    uint16_t *c = memory + reg[R_R0];
    while (*c) {
        putc((char)*c, stdout);
        ++c;
    }
    fflush(stdout);
}
9. Loading Programs
Programs are stored in binary files with a 16‑bit origin followed by the instruction words. The loader reads the origin, swaps endianness, and copies the rest of the file into memory:
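/* Defined before its first use so the code compiles without a prototype. */
uint16_t swap16(uint16_t x) {
    return (x << 8) | (x >> 8);
}
void read_image_file(FILE *file) {
    /* The first word is the origin: the address where the image is placed. */
    uint16_t origin;
    if (fread(&origin, sizeof(origin), 1, file) != 1) {
        return; /* empty or truncated file */
    }
    origin = swap16(origin);
    /* Never read past the end of memory. */
    uint16_t max_read = UINT16_MAX - origin;
    uint16_t *p = memory + origin;
    size_t read = fread(p, sizeof(uint16_t), max_read, file);
    /* The file is big-endian; swapping assumes a little-endian host. */
    while (read-- > 0) {
        *p = swap16(*p);
        ++p;
    }
}
10. Memory‑Mapped Registers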
Two special addresses provide keyboard status (KBSR) and data (KBDR). The mem_read function checks for pending input using select() and updates these registers accordingly.
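The polling step can be sketched as follows, assuming a Unix-like system. The name check_key matches the call in mem_read; fd_has_input is a hypothetical helper added here so the logic can be exercised on any file descriptor, not only standard input.

```c
#include <stdint.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd can be read without blocking, else 0. */
int fd_has_input(int fd) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* A zero timeout makes select() poll and return immediately. */
    struct timeval timeout = {0, 0};

    /* The first argument is the highest descriptor number plus one. */
    return select(fd + 1, &readfds, NULL, NULL, &timeout) > 0;
}

/* Used by mem_read to decide whether a key is waiting on standard input. */
uint16_t check_key(void) {
    return (uint16_t)fd_has_input(STDIN_FILENO);
}
```

Because the timeout is zero, the VM never stalls inside mem_read; an LC‑3 program that wants to wait for a key simply keeps reading KBSR until bit 15 is set.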
11. Platform‑Specific Setup (Unix)
To make non‑blocking keyboard input work, the tutorial disables canonical mode and echo, restores settings on exit, and installs a SIGINT handler.
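A minimal sketch of that setup using the POSIX termios interface is shown below. The function names (raw_lflags, disable_input_buffering, restore_input_buffering, handle_interrupt, setup_terminal) and the exit status 130 are assumptions chosen for this example and may differ from the original source.

```c
#include <signal.h>
#include <stdlib.h>
#include <termios.h>
#include <unistd.h>

static struct termios original_tio; /* settings to restore on exit */
static int tio_saved = 0;           /* 1 once original_tio is valid */

/* Clear canonical (line-buffered) mode and echo in a local-flags word. */
tcflag_t raw_lflags(tcflag_t lflags) {
    return lflags & ~(tcflag_t)(ICANON | ECHO);
}

void disable_input_buffering(void) {
    if (tcgetattr(STDIN_FILENO, &original_tio) != 0) {
        return; /* stdin is not a terminal; nothing to change */
    }
    tio_saved = 1;
    struct termios new_tio = original_tio;
    new_tio.c_lflag = raw_lflags(new_tio.c_lflag);
    tcsetattr(STDIN_FILENO, TCSANOW, &new_tio);
}

void restore_input_buffering(void) {
    if (tio_saved) {
        tcsetattr(STDIN_FILENO, TCSANOW, &original_tio);
    }
}

/* On Ctrl-C, put the terminal back before leaving.
   Only async-signal-safe calls are used here. */
void handle_interrupt(int signal_number) {
    (void)signal_number;
    restore_input_buffering();
    write(STDOUT_FILENO, "\n", 1);
    _exit(130);
}

/* Call once at startup, before the VM begins executing. */
void setup_terminal(void) {
    signal(SIGINT, handle_interrupt);
    disable_input_buffering();
    atexit(restore_input_buffering);
}
```

Registering the restore function with atexit covers normal termination, while the SIGINT handler covers interruption; without both, a crashed or interrupted VM can leave the shell with echo turned off.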
12. Running the VM
Compile the C source (e.g., lc3-vm.c) with a standard compiler, then run it with a compiled LC‑3 object file:
./lc3-vm path/to/2048.obj
The VM will load the program, execute it, and interact via the defined trap routines.
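The execution step is a fetch, decode, execute loop. The sketch below is a deliberately reduced version that handles only ADD and TRAP HALT and skips condition flags, so it is an illustration of the loop's shape and not a full VM. The function name run and the constant PC_START are assumptions; 0x3000 is the conventional start address for LC‑3 user programs.

```c
#include <stdint.h>

enum { R_R0 = 0, R_R1, R_R2, R_R3, R_R4, R_R5, R_R6, R_R7,
       R_PC, R_COND, R_COUNT };
enum { OP_ADD = 1, OP_TRAP = 15 };
enum { TRAP_HALT = 0x25 };
enum { PC_START = 0x3000 }; /* assumed default start address */

uint16_t memory[1 << 16];
uint16_t reg[R_COUNT];

uint16_t sign_extend(uint16_t x, int bit_count) {
    if ((x >> (bit_count - 1)) & 1) {
        x |= (uint16_t)(0xFFFFu << bit_count);
    }
    return x;
}

/* Runs from PC_START. Returns 0 on HALT, -1 on anything
   this reduced sketch does not implement. */
int run(void) {
    reg[R_PC] = PC_START;
    for (;;) {
        uint16_t instr = memory[reg[R_PC]++]; /* fetch, then advance PC */
        uint16_t op = instr >> 12;            /* decode the opcode */
        switch (op) {                         /* execute */
        case OP_ADD: {
            uint16_t dr  = (instr >> 9) & 0x7;
            uint16_t sr1 = (instr >> 6) & 0x7;
            uint16_t rhs = ((instr >> 5) & 0x1)
                               ? sign_extend(instr & 0x1F, 5)
                               : reg[instr & 0x7];
            reg[dr] = (uint16_t)(reg[sr1] + rhs);
            break;
        }
        case OP_TRAP:
            return ((instr & 0xFF) == TRAP_HALT) ? 0 : -1;
        default:
            return -1;
        }
    }
}
```

A full implementation would add one case per opcode, call update_flags after each register write, and route all memory accesses through mem_read and mem_write so that the keyboard registers work.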
13. Optional C++ Implementation
The article also sketches a generic C++ template‑based opcode dispatcher that reduces boilerplate by extracting common fields once and reusing them across instructions.
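The same idea of extracting shared fields once can be approximated in plain C. The sketch below is not the article's C++ template approach; it swaps in a table of function pointers indexed by opcode, and it covers only ADD and AND, which share one instruction layout. All names (alu_fn, alu_table, execute_alu) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint16_t (*alu_fn)(uint16_t a, uint16_t b);

uint16_t alu_add(uint16_t a, uint16_t b) { return (uint16_t)(a + b); }
uint16_t alu_and(uint16_t a, uint16_t b) { return (uint16_t)(a & b); }

/* Indexed by opcode; entries left NULL are not handled by this sketch. */
const alu_fn alu_table[16] = {
    [1] = alu_add, /* ADD = 0001 */
    [5] = alu_and, /* AND = 0101 */
};

uint16_t sign_extend(uint16_t x, int bit_count) {
    if ((x >> (bit_count - 1)) & 1) {
        x |= (uint16_t)(0xFFFFu << bit_count);
    }
    return x;
}

/* Decode the shared DR/SR1/operand fields once, then dispatch.
   Returns 0 on success, -1 if the opcode has no table entry. */
int execute_alu(uint16_t instr, uint16_t *reg) {
    alu_fn fn = alu_table[instr >> 12];
    if (fn == NULL) {
        return -1;
    }
    uint16_t dr  = (instr >> 9) & 0x7;
    uint16_t sr1 = (instr >> 6) & 0x7;
    uint16_t rhs = ((instr >> 5) & 0x1) ? sign_extend(instr & 0x1F, 5)
                                        : reg[instr & 0x7];
    reg[dr] = fn(reg[sr1], rhs);
    return 0;
}
```

The trade-off compared with a switch statement is an indirect call per instruction in exchange for writing the field extraction only once.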