Databases 16 min read

Inside SQLite’s Virtual Machine: How VDBE Executes SQL with Lemon and LR Parsing

This article explains SQLite's Virtual Database Engine (VDBE) architecture, its LR‑based grammar compilation using Lemon, compares it with Lua's virtual machine, and details the instruction set, execution flow, and stack versus register implementations.

WeChat Client Technology Team
WeChat Client Technology Team
WeChat Client Technology Team
Inside SQLite’s Virtual Machine: How VDBE Executes SQL with Lemon and LR Parsing

1 Preface

This article introduces SQLite's virtual machine VDBE and compares it with Lua's VM to better understand the differences between virtual machines.

1.1 Prerequisite Knowledge

Virtual‑machine design requires compiler‑theory fundamentals; a brief review of LR grammars is provided.

1.1.1 Grammar

LR grammar was proposed by D. Knuth in 1965. Variants such as LR(0), SLR(1), LR(1) and LALR(1) are described, with their characteristics and typical use cases.

YACC is a tool that generates LALR(1) parsers from BNF grammar. SQLite uses parse.y as its grammar file.

Lemon is SQLite's own LALR(1) parser generator, similar to YACC but with improvements: clearer syntax, no global variables, and re‑entrancy.

Example syntax differences: expr ::= expr PLUS expr { $$ = $1 + $3; } (YACC) expr(A) ::= expr(B) PLUS expr(C). { A = B + C; } (Lemon)

2 Elements of a Virtual Machine

1. Language and Grammar – SQLite uses SQL as its language; Lua uses its own script language.

2. Grammar Compiler – SQLite uses Lemon, while early Lua versions used YACC.

3. Instructions and Program – Instruction sequences for INSERT statements are illustrated.

4. Executor and Runtime Environment – Entry points are sqlite3VdbeExec for SQLite and lua_execute for Lua; they maintain a runtime stack, program counter (PC), and registers.

3 SQLite Virtual Machine

3.1 Architecture

Figure 1
Figure 1

3.2 SQL Statement Execution Flow

Demo SQL statements: create table examp(one text, two int); insert into examp values ('hello', 98); The flowchart is shown below.

Figure 2
Figure 2

3.3 SQL Statement Compilation

SQL is compiled into VDBE bytecode by the parser generated from parse.y; the entry function is sqlite3Parser. All SQL statements are transformed into bytecode before execution.

Relevant grammar rules (excerpt):

cmd ::= insert_cmd(R) INTO fullname(X) inscollist_opt(F) VALUES LP itemlist(Y) RP. { sqlite3Insert(pParse,X,Y,0,F,R); }
cmd ::= insert_cmd(R) INTO fullname(X) inscollist_opt(F) select(S). { sqlite3Insert(pParse,X,0,S,F,R); }

3.4 VDBE Instructions

Each VDBE instruction consists of an opcode and up to three operands (P1, P2, P3). Example instruction list for the INSERT demo is shown below.

Figure 3
Figure 3

Explanation of selected instructions:

0|Goto|0|11| – jump to instruction 11

1|Integer|0|0| – push database ID 0 onto the stack

2|OpenWrite|0|5| – open a writable cursor (P1=0 table ID, P2=5 root page)

3|SetNumColumns|0|2| – set number of columns

4|NewRowid|0|0| – allocate a new rowid and push it

5|String8|0|0|"hello"| – push string "hello"

6|Integer|98|0| – push integer 98

7|MakeRecord|2|0|ti| – combine top two stack elements into a binary record

8|Insert|0|3| – insert the record into the table and clear the stack

9|Close|0|0| – close the cursor

10|Halt|0|0| – sync dirty pages and finish

11|Transaction|0|1| – begin a transaction (acquire write lock)

3.4.1 Stack and Registers

Virtual machines can be stack‑based (e.g., JVM, Python, VDBE) or register‑based. Stack‑based VMs push operands onto a stack and later pop them for operations, which simplifies compilation but may increase instruction count. Register‑based VMs allocate registers beforehand and operate directly on them, yielding more compact code at the cost of register allocation complexity.

Lua used a stack‑based VM before version 5.0 and switched to a register‑based VM thereafter.

3.4.2 Comparison with Lua

Instruction sets of Lua 1.1 (stack‑based) and Lua 5.2 (register‑based) are compared.

Lua 1.1 example (image):

Figure 4
Figure 4

Lua 5.2 instruction categories (image):

Figure 5
Figure 5

Lua 5.2 uses 32‑bit unsigned integers for opcodes; 40 distinct opcodes are defined in lopcodes.h. The register‑based design yields smaller binaries and faster execution.

3.5 VDBE Execution Engine

The core execution loop of sqlite3VdbeExec fetches the current opcode, updates the program counter, and dispatches via a switch statement. A fragment of the loop is shown below:

for(pc = p->pc; rc == SQLITE_OK; pc++) {
    pOp = &p->aOp[pc];
    switch(pOp->opcode) {
        case OP_Goto: {
            CHECK_FOR_INTERRUPT;
            pc = pOp->p2 - 1;
            break;
        }
        /* ... other opcodes ... */
    }
}

VDBE responsibilities include storing compiled bytecode, executing each opcode's action, maintaining the runtime stack, handling the program counter, and managing transactions.

4 References

《程序设计语言编译原理》, 陈火旺等.

《lex与yacc》, John R. Levine et al.

SQLite source code (version 3.2.8).

Lua source code (various versions).

5 Appendix

Additional material on LR(0), SLR(1), LR(1), LALR(1) grammars.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

Virtual MachineSQLiteLuaLemonLR ParsingVDBE
WeChat Client Technology Team
Written by

WeChat Client Technology Team

Official account of the WeChat mobile client development team, sharing development experience, cutting‑edge tech, and little‑known stories across Android, iOS, macOS, Windows Phone, and Windows.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.