How to Build a Toy BASIC‑to‑Go Compiler in a Weekend
An experienced programmer shares a weekend project that builds a simplified BASIC variant, toybasic, and a three‑stage compiler written in Go—using Nex for lexical analysis, goyacc for parsing, and custom code generation—to translate BASIC programs into Go code, complete with examples and source links.
Introduction
A senior programmer with a computer‑science degree describes how, on a rainy weekend, they built a complete compiler from scratch. The compiler targets toybasic, a simplified BASIC variant, and translates its source code into Go.
The motivation was a love for BASIC and the existence of TinyBASIC; the author stripped out the INPUT statement to make the language even simpler. All source code is available on GitHub.
Demo Code
Example program (example/hello.bas):
PRINT "Hello, world."
LET x = (3 * 2) + 3
LET x = x + 1
IF x == 10 THEN PRINT "Ten!"
PRINT x
IF x >= 15 THEN GOTO 70
GOTO 30
END
Running the program produces:
$ ./toybasic < example/hello.bas
$ go run out.go
Hello, world.
Ten!
10
11
12
13
14
15
Compiler Structure
The compiler consists of three stages:
Lexical Analyzer – tokenizes the source characters.
Parser – builds an abstract syntax tree (AST) from the tokens.
Compiler – walks the AST and emits equivalent Go code.
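To make the hand-off between the three stages concrete, here is a minimal sketch that is not taken from the toybasic source. It handles only a hypothetical one-line subset (PRINT followed by integers joined by +), splits on whitespace instead of using a generated lexer, and checks the grammar by hand instead of using goyacc. The names lex, parse, generate, compile, and printStmt are illustrative assumptions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Stage 1: split the source into tokens. Splitting on whitespace is a
// stand-in for the real generated lexer.
func lex(src string) []string { return strings.Fields(src) }

// printStmt is a hypothetical AST node: a PRINT statement holding a
// chain of integer operands that are added together.
type printStmt struct{ operands []int }

// Stage 2: check the tokens against the tiny grammar
//
//	statement: PRINT INTEGER { '+' INTEGER }
//
// and build the AST.
func parse(toks []string) (*printStmt, error) {
	if len(toks) < 2 || toks[0] != "PRINT" {
		return nil, fmt.Errorf("expected PRINT followed by an expression")
	}
	if len(toks)%2 != 0 {
		return nil, fmt.Errorf("expression ends with an operator")
	}
	stmt := &printStmt{}
	for i, tok := range toks[1:] {
		if i%2 == 1 { // odd positions must hold the operator
			if tok != "+" {
				return nil, fmt.Errorf("expected '+', got %q", tok)
			}
			continue
		}
		n, err := strconv.Atoi(tok)
		if err != nil {
			return nil, fmt.Errorf("expected integer, got %q", tok)
		}
		stmt.operands = append(stmt.operands, n)
	}
	return stmt, nil
}

// Stage 3: walk the AST and emit one line of Go source.
func generate(stmt *printStmt) string {
	parts := make([]string, len(stmt.operands))
	for i, n := range stmt.operands {
		parts[i] = strconv.Itoa(n)
	}
	return "fmt.Println(" + strings.Join(parts, " + ") + ")\n"
}

// compile chains the three stages together.
func compile(src string) (string, error) {
	stmt, err := parse(lex(src))
	if err != nil {
		return "", err
	}
	return generate(stmt), nil
}

func main() {
	out, err := compile("PRINT 3 + 2 + 1")
	if err != nil {
		panic(err)
	}
	fmt.Print(out) // prints: fmt.Println(3 + 2 + 1)
}
```

Each stage consumes only the previous stage's output, which is what allows the real project to swap in a nex-generated lexer and a goyacc-generated parser without touching the code generator.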
Lexical Analyzer
The lexer is generated with nex, a Go clone of the classic lex tool. It uses a configuration file of regular expressions to recognize keywords, identifiers, numbers, strings, etc.
/PRINT/ { return PRINT }
/==/ { return EQ }
/[0-9]+/ { lval.num, _ = strconv.Atoi(yylex.Text()); return INTEGER }
/[0-9]+\.[0-9]*/ { lval.dec, _ = strconv.ParseFloat(yylex.Text(), 64); return DECIMAL }
/"[^"]*"/ { lval.s = yylex.Text(); return STRING }
nex then generates thousands of lines of Go code that implement a deterministic finite automaton for tokenization.
Parser
The parser is built with goyacc, the Go version of yacc. The grammar is a small subset of BASIC, inspired by TinyBASIC, with added string support and without INPUT.
statement:
      PRINT expr_list { $$ = opr(PRINT, 1, $2); }
    | IF expression relop expression THEN statement { $$ = opr(IF, 4, $2, $3, $4, $6); }
    | GOTO expression { $$ = opr(GOTO, 1, $2); }
    | LET v '=' expression { $$ = opr(LET, 2, $2, $4); }
    | END { $$ = opr(END, 0); }
    ;
The parser produces an AST such as:
PrintOp
   |
ListOp --------.
   |           |
StringOp      * Op --------.
               |           |
            GroupOp     INTEGER
               |
            InfixOp
               |
       INTEGER + INTEGER
A %union declaration defines the value types for tokens and AST nodes.
%union {
    v string      /* Variable */
    s string      /* String */
    num int       /* Integer constant */
    dec float64   /* Decimal constant */
    node Node     /* Node in the AST */
};
Code Generator (Compiler)
The final stage is hand‑written Go code that defines a Node interface and concrete types such as PrintOp. Each node implements a Generate() method that emits Go source.
type Node interface { Generate() }
...
type PrintOp struct { expression Node }
func (op PrintOp) Generate() {
	fmt.Fprint(writer, "fmt.Println(")
	op.expression.Generate()
	fmt.Fprintln(writer, ")")
}
Running the compiler on the example program yields a Go file that, when executed, reproduces the original BASIC output.
Conclusion
The project demonstrates that a working compiler for a small BASIC dialect can be built in a single weekend using existing lexer and parser generators, with all core components written in Go. It serves as a practical learning exercise for language implementation and compiler construction.
21CTO
