
How an NES (FC) Emulator Works: Architecture, Memory, CPU, PPU, and Rendering

This article explains the fundamental principles and workflow of building an NES (Family Computer) emulator, covering ROM loading, memory mapping, CPU and PPU collaboration, graphics rendering, sprite handling, palette management, and interrupt processing with illustrative code examples.

Kuaishou Tech

The NES, known in Japan as the Family Computer (Famicom, or FC), was Nintendo's first cartridge-based home console. The Famicom was released in 1983 and discontinued in 2003, and the platform sold about 61.9 million units worldwide. Many developers are curious about how to recreate its functionality in software.

The emulator loads a ROM file, then repeatedly calls cpu_work() and ppu_work() inside a main loop until the program quits. The CPU fetches instructions identified by 8-bit opcodes and uses them to manipulate registers and memory, while the PPU converts memory data into the pixels displayed on the screen.
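The loop described above can be sketched as follows. This is a minimal illustration, not the article's actual code: the `Emulator` struct, its fields, and the `max_iterations` bound are assumptions added so the loop is testable, and the two work functions are stubs that only count calls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical emulator state; the field names are assumptions. */
typedef struct {
    uint64_t cpu_steps;  /* how many times cpu_work() has run */
    uint64_t ppu_steps;  /* how many times ppu_work() has run */
    bool     quit;       /* set by the host when the user exits */
} Emulator;

/* Stub: a real emulator would execute CPU instructions here. */
void cpu_work(Emulator *emu) { emu->cpu_steps++; }

/* Stub: a real emulator would advance the PPU and draw pixels here. */
void ppu_work(Emulator *emu) { emu->ppu_steps++; }

/* Alternate CPU and PPU work until quit is requested or the
 * iteration bound is reached (the bound exists only for testing). */
void run(Emulator *emu, uint64_t max_iterations) {
    for (uint64_t i = 0; i < max_iterations && !emu->quit; i++) {
        cpu_work(emu);
        ppu_work(emu);
    }
}
```

In a real emulator the loop would also poll input and pace itself to the display's frame rate.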

Memory is simulated with a 64 KB address space. A simple struct Memory allocates 0x10000 bytes and provides a _getRealAddr(uint16_t addr) function that handles mirroring: the 2 KB of internal RAM at 0x0000-0x07FF repeats through 0x1FFF, and the eight PPU I/O registers at 0x2000-0x2007 repeat through 0x3FFF.
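A minimal sketch of that mirroring logic is shown below. The function is named `get_real_addr` here instead of `_getRealAddr` (leading-underscore names at file scope are reserved in C), and `mem_read` / `mem_write` are hypothetical helpers. Real PPU registers have side effects on access, which this flat-array sketch ignores.

```c
#include <stdint.h>

/* Flat 64 KB CPU address space, as described in the text. */
typedef struct {
    uint8_t bytes[0x10000];
} Memory;

/* Fold mirrored addresses onto the region that really exists. */
uint16_t get_real_addr(uint16_t addr) {
    if (addr < 0x2000) {
        /* 2 KB internal RAM, mirrored four times in 0x0000-0x1FFF */
        return (uint16_t)(addr & 0x07FF);
    }
    if (addr < 0x4000) {
        /* 8 PPU registers, mirrored every 8 bytes in 0x2000-0x3FFF */
        return (uint16_t)(0x2000 | (addr & 0x0007));
    }
    return addr;  /* everything else is used as-is */
}

uint8_t mem_read(const Memory *m, uint16_t addr) {
    return m->bytes[get_real_addr(addr)];
}

void mem_write(Memory *m, uint16_t addr, uint8_t value) {
    m->bytes[get_real_addr(addr)] = value;
}
```

With this mapping, a write to 0x0801 is visible when reading 0x0001, because both resolve to the same byte of internal RAM.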

ROM loading parses the 16-byte iNES header to determine PRG and CHR ROM sizes, mapper flags, and other metadata. On the simplest cartridges (no bank switching), the PRG ROM is mapped into 0x8000-0xFFFF; if only 16 KB is present, the same 16 KB appears in both halves of that range.
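The header fields mentioned above could be extracted roughly as follows. The `InesInfo` struct and both function names are assumptions made for this sketch; it reads only the fields discussed here and handles PRG mapping only for the no-bank-switching case.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary of the header fields used in this article. */
typedef struct {
    size_t  prg_size;            /* PRG ROM size in bytes */
    size_t  chr_size;            /* CHR ROM size in bytes (0 = CHR RAM) */
    uint8_t mapper;              /* mapper number */
    bool    vertical_mirroring;  /* name-table mirroring flag */
    bool    has_trainer;         /* 512-byte trainer precedes PRG data */
} InesInfo;

/* Parse a 16-byte iNES header; returns false if the magic is wrong. */
bool ines_parse(const uint8_t header[16], InesInfo *out) {
    static const uint8_t magic[4] = { 'N', 'E', 'S', 0x1A };
    if (memcmp(header, magic, sizeof magic) != 0) {
        return false;
    }
    out->prg_size = (size_t)header[4] * 16384;  /* 16 KB units */
    out->chr_size = (size_t)header[5] * 8192;   /* 8 KB units */
    /* Low nibble of the mapper is in byte 6, high nibble in byte 7. */
    out->mapper = (uint8_t)((header[7] & 0xF0) | (header[6] >> 4));
    out->vertical_mirroring = (header[6] & 0x01) != 0;
    out->has_trainer = (header[6] & 0x04) != 0;
    return true;
}

/* Translate a CPU address in 0x8000-0xFFFF to a PRG ROM offset.
 * The modulo makes a 16 KB image appear in both halves of the range. */
size_t prg_offset(uint16_t addr, size_t prg_size) {
    return (size_t)(addr - 0x8000) % prg_size;
}
```

For a 16 KB image, `prg_offset(0xC000, 16384)` is 0, so the upper half of the range reads the same bytes as the lower half.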

The CPU (2A03) is an 8‑bit processor using the 6502 instruction set. Its registers are represented by a struct CPU_2A03 containing PC, SP, status flags, and A/X/Y registers.
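One plausible layout for that register struct is sketched below. The field names, the flag constants, and the `cpu_set_zn` and `cpu_lda_imm` helpers are assumptions for illustration; only the struct name `CPU_2A03` and the list of registers come from the text.

```c
#include <stdint.h>

/* Bit positions of the 6502 status flags. */
enum {
    FLAG_C = 0x01,  /* carry */
    FLAG_Z = 0x02,  /* zero */
    FLAG_I = 0x04,  /* IRQ disable */
    FLAG_D = 0x08,  /* decimal (stored, but has no effect on the 2A03) */
    FLAG_B = 0x10,  /* break (only meaningful in pushed copies) */
    FLAG_U = 0x20,  /* unused */
    FLAG_V = 0x40,  /* overflow */
    FLAG_N = 0x80   /* negative */
};

typedef struct {
    uint16_t pc;  /* program counter */
    uint8_t  sp;  /* stack pointer (offset into page 0x01) */
    uint8_t  p;   /* status flags */
    uint8_t  a;   /* accumulator */
    uint8_t  x;   /* X index register */
    uint8_t  y;   /* Y index register */
} CPU_2A03;

/* Update Z and N from a result, as most load/ALU instructions do. */
void cpu_set_zn(CPU_2A03 *cpu, uint8_t value) {
    cpu->p &= (uint8_t)~(FLAG_Z | FLAG_N);
    if (value == 0)   cpu->p |= FLAG_Z;
    if (value & 0x80) cpu->p |= FLAG_N;
}

/* Example instruction body: LDA with an immediate operand. */
void cpu_lda_imm(CPU_2A03 *cpu, uint8_t operand) {
    cpu->a = operand;
    cpu_set_zn(cpu, cpu->a);
}
```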

The stack resides at 0x0100-0x01FF and grows downward. This description initializes the stack pointer (SP) to 0xFF; on real hardware the reset sequence leaves SP at 0xFD.
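Push and pop on a downward-growing stack in page 0x01 might look like the sketch below. The `StackDemo` struct and the function names are hypothetical; a full emulator would operate on its CPU and memory structs instead.

```c
#include <stdint.h>

/* Hypothetical holder for the 2 KB internal RAM and the SP register. */
typedef struct {
    uint8_t ram[0x0800];
    uint8_t sp;
} StackDemo;

/* Push: store at 0x0100 + SP, then decrement SP (wraps as 8-bit). */
void push8(StackDemo *s, uint8_t value) {
    s->ram[0x0100 + s->sp] = value;
    s->sp--;
}

/* Pop: increment SP first, then read from 0x0100 + SP. */
uint8_t pop8(StackDemo *s) {
    s->sp++;
    return s->ram[0x0100 + s->sp];
}

/* 16-bit values are pushed high byte first, as the 6502 does for
 * return addresses. */
void push16(StackDemo *s, uint16_t value) {
    push8(s, (uint8_t)(value >> 8));
    push8(s, (uint8_t)(value & 0xFF));
}

uint16_t pop16(StackDemo *s) {
    uint8_t lo = pop8(s);
    uint8_t hi = pop8(s);
    return (uint16_t)(lo | (hi << 8));
}
```

Starting from SP = 0xFF, pushing 0xC123 stores 0xC1 at 0x01FF and 0x23 at 0x01FE, leaving SP at 0xFD.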

Instruction decoding reads an 8‑bit opcode and dispatches via a switch statement to the appropriate handler (e.g., AND, ASL, BCC, etc.).
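A switch-based dispatcher covering one addressing mode each of AND, ASL, and BCC could be sketched like this. The `Machine` struct and `cpu_step` are assumptions for the example; cycle counting and the remaining opcodes are omitted.

```c
#include <stdint.h>

enum { FLAG_C = 0x01, FLAG_Z = 0x02, FLAG_N = 0x80 };

/* Hypothetical minimal machine: just enough state for three opcodes. */
typedef struct {
    uint16_t pc;
    uint8_t  a;
    uint8_t  p;
    uint8_t  mem[0x10000];
} Machine;

void set_zn(Machine *m, uint8_t value) {
    m->p &= (uint8_t)~(FLAG_Z | FLAG_N);
    if (value == 0)   m->p |= FLAG_Z;
    if (value & 0x80) m->p |= FLAG_N;
}

/* Fetch one opcode and dispatch it.
 * Returns 0 on success, -1 if the opcode is not implemented here. */
int cpu_step(Machine *m) {
    uint8_t opcode = m->mem[m->pc++];
    switch (opcode) {
    case 0x29:  /* AND #imm: A = A & operand */
        m->a &= m->mem[m->pc++];
        set_zn(m, m->a);
        return 0;
    case 0x0A:  /* ASL A: shift left, old bit 7 goes to carry */
        m->p = (uint8_t)((m->p & ~FLAG_C) | (m->a >> 7));
        m->a = (uint8_t)(m->a << 1);
        set_zn(m, m->a);
        return 0;
    case 0x90: {  /* BCC rel: branch if carry is clear */
        int8_t offset = (int8_t)m->mem[m->pc++];
        if (!(m->p & FLAG_C)) {
            m->pc = (uint16_t)(m->pc + offset);
        }
        return 0;
    }
    default:
        return -1;
    }
}
```

Larger emulators often replace the switch with a 256-entry table of function pointers, but the switch keeps the control flow easy to follow.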

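The PPU (2C02) handles graphics. It has a 16 KB address space that holds pattern tables, name tables, attribute tables, and palettes; the console itself provides only 2 KB of VRAM for name tables, while pattern tables usually come from the cartridge's CHR ROM. Tiles are 8×8 pixels, stored as 16 bytes (two bit-planes). Name tables map tiles to screen positions, while attribute tables provide the high two bits of the palette index: each attribute byte covers a 4×4 tile area and holds one 2-bit value per 2×2 tile block.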
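The two lookups described above, decoding a pixel from a tile's bit-planes and picking the attribute bits for a tile, can be sketched as follows. Both function names are hypothetical, and the attribute table is assumed to be the 64 bytes that follow a name table.

```c
#include <stdint.h>

/* Return the 2-bit pixel value at (x, y) inside an 8x8 tile.
 * Bytes 0-7 hold the low bit-plane, bytes 8-15 the high bit-plane;
 * bit 7 of each byte is the leftmost pixel. */
uint8_t tile_pixel(const uint8_t tile[16], int x, int y) {
    int bit = 7 - x;
    uint8_t lo = (uint8_t)((tile[y] >> bit) & 1);
    uint8_t hi = (uint8_t)((tile[y + 8] >> bit) & 1);
    return (uint8_t)(lo | (hi << 1));
}

/* Return the 2 attribute bits for the tile at (tile_x, tile_y).
 * Each attribute byte covers a 4x4 tile area (8 bytes per row) and
 * stores four 2-bit fields: top-left, top-right, bottom-left,
 * bottom-right, from the lowest bits upward. */
uint8_t attribute_bits(const uint8_t attr_table[64], int tile_x, int tile_y) {
    uint8_t attr = attr_table[(tile_y / 4) * 8 + (tile_x / 4)];
    int shift = ((tile_y & 2) << 1) | (tile_x & 2);  /* 0, 2, 4, or 6 */
    return (uint8_t)((attr >> shift) & 3);
}
```

The final 4-bit background palette index is then `(attribute_bits << 2) | tile_pixel`.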

Palettes are stored at 0x3F00‑0x3F1F, consisting of two 16‑byte palettes for background and sprites. Colors are indexed into a system palette of 64 RGB entries.
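Turning a pixel value into an RGB color is a two-step lookup, sketched below. The `Rgb` type and both function names are assumptions. The 64-entry system palette is passed in as a parameter, because its exact RGB values vary between emulators and are not given in the text.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t r, g, b;
} Rgb;

/* Compute the offset into the 32-byte palette RAM (0x3F00-0x3F1F).
 * Pixel value 0 always selects the shared backdrop color at offset 0. */
uint8_t palette_ram_index(bool is_sprite, uint8_t palette_num, uint8_t pixel) {
    if ((pixel & 3) == 0) {
        return 0;
    }
    return (uint8_t)((is_sprite ? 0x10 : 0x00)
                     | ((palette_num & 3) << 2)
                     | (pixel & 3));
}

/* Step 1: palette RAM gives a 6-bit system color index.
 * Step 2: the system palette maps that index to RGB. */
Rgb lookup_color(const uint8_t palette_ram[32], const Rgb system_palette[64],
                 bool is_sprite, uint8_t palette_num, uint8_t pixel) {
    uint8_t offset = palette_ram_index(is_sprite, palette_num, pixel);
    uint8_t color = (uint8_t)(palette_ram[offset] & 0x3F);
    return system_palette[color];
}
```

In this sketch a sprite pixel with value 0 also resolves to the backdrop entry; a renderer would normally treat it as transparent and skip the lookup.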

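Sprites are defined by a struct Sprite containing Y position, tile index, attribute byte, and X position. The attribute byte encodes palette selection, priority, and horizontal/vertical flipping. Up to 64 sprites are stored in OAM, a dedicated 256-byte memory inside the PPU (4 bytes per sprite). OAM is not part of the CPU address space; the program fills it through the PPU registers at 0x2003 (OAMADDR) and 0x2004 (OAMDATA), or with a DMA transfer triggered by writing to 0x4014.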
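The four-byte sprite layout and its attribute bits can be sketched as below. The struct name `Sprite` comes from the text; the field names and accessor functions are assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* One OAM entry, in the order the bytes appear in OAM. */
typedef struct {
    uint8_t y;     /* Y position */
    uint8_t tile;  /* tile index into the pattern table */
    uint8_t attr;  /* palette, priority, and flip bits */
    uint8_t x;     /* X position */
} Sprite;

/* Read sprite number `index` (0-63) out of the 256-byte OAM. */
Sprite sprite_from_oam(const uint8_t oam[256], int index) {
    const uint8_t *p = &oam[index * 4];
    Sprite s = { p[0], p[1], p[2], p[3] };
    return s;
}

/* Bits 0-1: which of the four sprite palettes to use. */
uint8_t sprite_palette(Sprite s)   { return (uint8_t)(s.attr & 0x03); }
/* Bit 5: when set, the sprite is drawn behind the background. */
bool    sprite_behind_bg(Sprite s) { return (s.attr & 0x20) != 0; }
/* Bit 6: flip horizontally. */
bool    sprite_flip_h(Sprite s)    { return (s.attr & 0x40) != 0; }
/* Bit 7: flip vertically. */
bool    sprite_flip_v(Sprite s)    { return (s.attr & 0x80) != 0; }
```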

Rendering follows the PPU's scanline timing: on NTSC systems the PPU runs at three times the CPU clock and draws pixels in real time, without a framebuffer. The visible area is 256×240 pixels; a full NTSC frame is 262 scanlines of 341 PPU cycles (dots) each, and the remainder is spent in the HBlank and VBlank periods.
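The timing relationship can be sketched as a dot counter that advances three times per CPU cycle. The `PpuTiming` struct and function names are assumptions; the sketch raises the VBlank flag at scanline 241, dot 1 and clears it on the pre-render line, and it ignores finer details such as the dot skipped on odd frames.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    DOTS_PER_SCANLINE   = 341,
    SCANLINES_PER_FRAME = 262,
    VBLANK_SCANLINE     = 241,  /* first line of vertical blanking */
    PRERENDER_SCANLINE  = 261   /* last line; VBlank flag is cleared */
};

/* Hypothetical PPU position tracker. */
typedef struct {
    int      dot;        /* 0-340 within the current scanline */
    int      scanline;   /* 0-261 within the current frame */
    uint64_t frame;      /* completed frames */
    bool     in_vblank;  /* true between VBlank start and pre-render */
} PpuTiming;

/* Advance the PPU by one dot (one pixel clock). */
void ppu_tick(PpuTiming *p) {
    p->dot++;
    if (p->dot == DOTS_PER_SCANLINE) {
        p->dot = 0;
        p->scanline++;
        if (p->scanline == SCANLINES_PER_FRAME) {
            p->scanline = 0;
            p->frame++;
        }
    }
    if (p->scanline == VBLANK_SCANLINE && p->dot == 1) {
        p->in_vblank = true;
    }
    if (p->scanline == PRERENDER_SCANLINE && p->dot == 1) {
        p->in_vblank = false;
    }
}

/* NTSC: the PPU runs three dots for every CPU cycle. */
void ppu_run_for_cpu_cycles(PpuTiming *p, int cpu_cycles) {
    for (int i = 0; i < cpu_cycles * 3; i++) {
        ppu_tick(p);
    }
}
```

One frame is 341 × 262 = 89,342 dots, which at three dots per CPU cycle is a little under 29,781 CPU cycles.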

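When VBlank begins, the PPU raises the NMI interrupt (if enabled through bit 7 of the PPU control register at 0x2000), allowing the program to update graphics safely. Handling an interrupt involves pushing the PC and the status register onto the stack, jumping to the address stored at the interrupt's vector (0xFFFA for NMI), and restoring state with RTI when the handler finishes. The I flag in the status register masks only IRQs; NMI, being non-maskable, ignores it.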
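NMI entry and the matching RTI could be sketched as follows. The `Machine` struct and the function names are assumptions for this example; cycle costs and the IRQ and BRK paths are left out.

```c
#include <stdint.h>

enum { FLAG_I = 0x04, FLAG_B = 0x10, FLAG_U = 0x20 };

/* Hypothetical minimal machine state for interrupt handling. */
typedef struct {
    uint16_t pc;
    uint8_t  sp;
    uint8_t  p;
    uint8_t  mem[0x10000];
} Machine;

void push8(Machine *m, uint8_t value) {
    m->mem[0x0100 + m->sp] = value;
    m->sp--;
}

uint8_t pop8(Machine *m) {
    m->sp++;
    return m->mem[0x0100 + m->sp];
}

/* Enter the NMI handler: save PC and status, then load the vector. */
void cpu_nmi(Machine *m) {
    push8(m, (uint8_t)(m->pc >> 8));
    push8(m, (uint8_t)(m->pc & 0xFF));
    /* The pushed status has B clear and the unused bit set. */
    push8(m, (uint8_t)((m->p & ~FLAG_B) | FLAG_U));
    m->p |= FLAG_I;  /* block IRQs while the handler runs */
    m->pc = (uint16_t)(m->mem[0xFFFA] | (m->mem[0xFFFB] << 8));
}

/* RTI: restore status first, then the return address. */
void cpu_rti(Machine *m) {
    m->p = (uint8_t)((pop8(m) & ~FLAG_B) | FLAG_U);
    uint8_t lo = pop8(m);
    uint8_t hi = pop8(m);
    m->pc = (uint16_t)(lo | (hi << 8));
}
```

A caller would typically invoke `cpu_nmi` between instructions when the PPU signals the start of VBlank and NMI generation is enabled.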

Overall, the article provides a high‑level overview of the emulator's components and their interactions, while pointing readers to nesdev.com for deeper details on 6502 assembly, scrolling, collision detection, controller input, timing, and mapper support.
