How to Build a Simple JSON Parser in Java: A Step‑by‑Step Guide
This article walks through the design and implementation of a lightweight JSON parser in Java. It covers lexical and syntax analysis, token definitions, the core parsing algorithm, testing, and a brief demonstration of JSON beautification, with the aim of giving developers a clear understanding of how JSON processing works.
Background
JSON (JavaScript Object Notation) is a lightweight data interchange format. Compared with XML, JSON offers better readability and smaller size, making it popular in web development. Developers are encouraged to understand JSON fundamentals and its parsing mechanisms.
JSON Parser Implementation Principles
A JSON parser essentially builds a state machine based on JSON grammar rules, taking a JSON string as input and producing a JSON object. The parsing process includes lexical analysis and syntax analysis.
Example JSON string:
```json
{
  "name": "小明",
  "age": 18
}
```
Lexical analysis yields tokens such as:
{, name, :, 小明, ,, age, :, 18, }
Lexical analyzer input/output diagram.
After tokenization, syntax analysis checks the token sequence against JSON grammar to ensure structural validity.
Lexical Analysis
Lexical analysis converts a JSON string into a stream of tokens according to JSON's lexical rules. The parser distinguishes the following token types:
BEGIN_OBJECT ({)
END_OBJECT (})
BEGIN_ARRAY ([)
END_ARRAY (])
NULL (null)
NUMBER
STRING
BOOLEAN (true/false)
SEP_COLON (:)
SEP_COMMA (,)
TokenType enum definition:
```java
public enum TokenType {
    BEGIN_OBJECT(1),
    END_OBJECT(2),
    BEGIN_ARRAY(4),
    END_ARRAY(8),
    NULL(16),
    NUMBER(32),
    STRING(64),
    BOOLEAN(128),
    SEP_COLON(256),
    SEP_COMMA(512),
    END_DOCUMENT(1024);
    // constructor and getter omitted for brevity
}
```
Token class encapsulates type and literal value:
```java
public class Token {
    private TokenType tokenType;
    private String value;
    // other code omitted
}
```
CharReader reads characters from a Reader:
```java
public class CharReader {
    public char peek() throws IOException { /* ... */ }
    public char next() throws IOException { /* ... */ }
    public void back() { /* ... */ }
    public boolean hasMore() throws IOException { /* ... */ }
    // other methods omitted
}
```
Tokenizer uses CharReader to produce a TokenList:
```java
public class Tokenizer {
    public TokenList tokenize(CharReader charReader) throws IOException { /* ... */ }
    private Token start() throws IOException { /* ... */ }
    // other helper methods omitted
}
```
Key lexical rule: based on the first character, the tokenizer decides token type (e.g., '{' → BEGIN_OBJECT, 'n' → NULL, '"' → STRING, digits → NUMBER, etc.).
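As a rough illustration of that first-character rule, the sketch below tokenizes a JSON string held in memory. It is not the article's Tokenizer: the class name `TokenizerSketch`, its nested `Type` and `Token` types, and the `structural` helper are all hypothetical. It reads from a `String` rather than a CharReader, and it assumes strings contain no escape sequences and does not strictly validate numbers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the article's Tokenizer: reads from a String,
// ignores string escapes, and does not strictly validate numbers.
final class TokenizerSketch {

    enum Type {
        BEGIN_OBJECT, END_OBJECT, BEGIN_ARRAY, END_ARRAY, NULL, NUMBER,
        STRING, BOOLEAN, SEP_COLON, SEP_COMMA, END_DOCUMENT
    }

    static final class Token {
        final Type type;
        final String value;

        Token(Type type, String value) {
            this.type = type;
            this.value = value;
        }
    }

    static List<Token> tokenize(String json) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < json.length()) {
            char c = json.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace between tokens is skipped
            } else if ("{}[]:,".indexOf(c) >= 0) {
                tokens.add(new Token(structural(c), String.valueOf(c)));
                i++;
            } else if (c == '"') {
                int end = json.indexOf('"', i + 1); // assumption: no escaped quotes
                if (end < 0) {
                    throw new IllegalArgumentException("Unterminated string at index " + i);
                }
                tokens.add(new Token(Type.STRING, json.substring(i + 1, end)));
                i = end + 1;
            } else if (c == 'n' || c == 't' || c == 'f') {
                // The first character selects which literal must follow
                String word = c == 'n' ? "null" : c == 't' ? "true" : "false";
                if (!json.startsWith(word, i)) {
                    throw new IllegalArgumentException("Invalid literal at index " + i);
                }
                tokens.add(new Token(c == 'n' ? Type.NULL : Type.BOOLEAN, word));
                i += word.length();
            } else if (c == '-' || (c >= '0' && c <= '9')) {
                int start = i++;
                // Lenient scan: collects number characters without checking their order
                while (i < json.length() && "0123456789.eE+-".indexOf(json.charAt(i)) >= 0) {
                    i++;
                }
                tokens.add(new Token(Type.NUMBER, json.substring(start, i)));
            } else {
                throw new IllegalArgumentException("Unexpected character '" + c + "' at index " + i);
            }
        }
        tokens.add(new Token(Type.END_DOCUMENT, ""));
        return tokens;
    }

    private static Type structural(char c) {
        switch (c) {
            case '{': return Type.BEGIN_OBJECT;
            case '}': return Type.END_OBJECT;
            case '[': return Type.BEGIN_ARRAY;
            case ']': return Type.END_ARRAY;
            case ':': return Type.SEP_COLON;
            default:  return Type.SEP_COMMA; // only ',' remains
        }
    }
}
```

Calling `TokenizerSketch.tokenize("{\"id\": 1}")` yields BEGIN_OBJECT, STRING, SEP_COLON, NUMBER, END_OBJECT, followed by a closing END_DOCUMENT token.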
Syntax Analysis
Syntax analysis consumes the token list and builds JsonObject or JsonArray structures according to the grammar:
```
object = {} | { members }
members = pair | pair , members
pair = string : value
array = [] | [ elements ]
elements = value | value , elements
value = string | number | object | array | true | false | null
```
JsonObject and JsonArray helper classes:
```java
import java.util.HashMap;
import java.util.Map;

public class JsonObject {
    // Maps each member name to its parsed value
    private Map<String, Object> map = new HashMap<>();
    public void put(String key, Object value) { map.put(key, value); }
    public Object get(String key) { return map.get(key); }
    // other methods omitted
}
```
```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class JsonArray implements Iterable<Object> {
    private List<Object> list = new ArrayList<>();
    public void add(Object obj) { list.add(obj); }
    public Object get(int index) { return list.get(index); }
    // Required by Iterable; delegates to the backing list
    @Override
    public Iterator<Object> iterator() { return list.iterator(); }
    // other methods omitted
}
```
Core parsing method parseJsonObject processes tokens recursively, handling objects, arrays, literals, and enforcing expected token types.
Parsing flow summary:
Read a token and verify it matches the expected type.
If valid, update the expected token set; otherwise, throw an exception.
Repeat until all tokens are consumed or an error occurs.
Example token sequence {, id, :, 1, } demonstrates how the parser transitions between expected token states.
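One way to picture those transitions is with a bitmask of expected token codes. The sketch below is an assumption about how such a check could look, not the article's parseJsonObject: the class `ExpectedTokenSketch` and its `Token` type are hypothetical, it reuses the power-of-two codes from the TokenType enum so that several types can be OR-ed into one `int`, and it handles only objects whose values are strings, integers, or nested objects. For `{, id, :, 1, }` the expected set moves from STRING or END_OBJECT, to SEP_COLON, to a value, to SEP_COMMA or END_OBJECT.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of expected-token checking; arrays, booleans,
// null, and non-integer numbers are left out to keep it short.
final class ExpectedTokenSketch {

    // Same power-of-two codes as the TokenType enum, so sets can be OR-ed together
    static final int BEGIN_OBJECT = 1;
    static final int END_OBJECT = 2;
    static final int NUMBER = 32;
    static final int STRING = 64;
    static final int SEP_COLON = 256;
    static final int SEP_COMMA = 512;

    static final class Token {
        final int code;
        final String value;

        Token(int code, String value) {
            this.code = code;
            this.value = value;
        }
    }

    static Map<String, Object> parse(List<Token> tokens) {
        Iterator<Token> it = tokens.iterator();
        if (!it.hasNext() || it.next().code != BEGIN_OBJECT) {
            throw new IllegalStateException("Expected '{' at start");
        }
        return parseObject(it);
    }

    // Called after '{' has been consumed; returns when the matching '}' is read
    private static Map<String, Object> parseObject(Iterator<Token> it) {
        Map<String, Object> result = new LinkedHashMap<>();
        String key = null;                   // non-null while a value is pending
        int expected = STRING | END_OBJECT;  // right after '{'
        while (it.hasNext()) {
            Token token = it.next();
            if ((token.code & expected) == 0) {
                throw new IllegalStateException("Unexpected token: " + token.value);
            }
            switch (token.code) {
                case STRING:
                    if (key == null) {       // string in key position
                        key = token.value;
                        expected = SEP_COLON;
                    } else {                 // string in value position
                        result.put(key, token.value);
                        key = null;
                        expected = SEP_COMMA | END_OBJECT;
                    }
                    break;
                case SEP_COLON:
                    expected = STRING | NUMBER | BEGIN_OBJECT;
                    break;
                case NUMBER:
                    result.put(key, Long.valueOf(token.value)); // integers only
                    key = null;
                    expected = SEP_COMMA | END_OBJECT;
                    break;
                case BEGIN_OBJECT:
                    result.put(key, parseObject(it)); // recurse for a nested object
                    key = null;
                    expected = SEP_COMMA | END_OBJECT;
                    break;
                case SEP_COMMA:
                    expected = STRING;       // another pair must follow
                    break;
                case END_OBJECT:
                    return result;
                default:
                    throw new IllegalStateException("Unsupported token: " + token.value);
            }
        }
        throw new IllegalStateException("Unexpected end of tokens");
    }
}
```

Because a comma narrows the expected set to STRING alone, a trailing comma before `}` is rejected, and a missing colon fails at the first value token.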
Testing and Demonstration
Tests use a sample JSON file (e.g., music.json) to verify correctness. The article also demonstrates JSON beautification on simulated hero data, which the original shows as an image.
The beautification code is provided as a supplemental feature.
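The beautification code itself is not reproduced here, so the following is only a guess at its general shape. It pretty-prints a parsed structure with two-space indentation. The class `BeautifySketch` is hypothetical, and it assumes the parsed data is held in plain `Map` and `List` instances rather than the article's JsonObject and JsonArray. Only quotes and backslashes are escaped.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: formats Map/List/String/Number/Boolean/null values
// as indented JSON text. Control characters in strings are not escaped.
final class BeautifySketch {

    static String beautify(Object value) {
        StringBuilder sb = new StringBuilder();
        write(value, 0, sb);
        return sb.toString();
    }

    private static void write(Object value, int depth, StringBuilder sb) {
        if (value instanceof Map) {
            Map<?, ?> map = (Map<?, ?>) value;
            if (map.isEmpty()) {
                sb.append("{}");
                return;
            }
            sb.append("{\n");
            int i = 0;
            for (Map.Entry<?, ?> e : map.entrySet()) {
                indent(depth + 1, sb);
                sb.append('"').append(escape(String.valueOf(e.getKey()))).append("\": ");
                write(e.getValue(), depth + 1, sb);
                if (++i < map.size()) {
                    sb.append(','); // comma after every member except the last
                }
                sb.append('\n');
            }
            indent(depth, sb);
            sb.append('}');
        } else if (value instanceof List) {
            List<?> list = (List<?>) value;
            if (list.isEmpty()) {
                sb.append("[]");
                return;
            }
            sb.append("[\n");
            for (int i = 0; i < list.size(); i++) {
                indent(depth + 1, sb);
                write(list.get(i), depth + 1, sb);
                if (i < list.size() - 1) {
                    sb.append(',');
                }
                sb.append('\n');
            }
            indent(depth, sb);
            sb.append(']');
        } else if (value instanceof String) {
            sb.append('"').append(escape((String) value)).append('"');
        } else {
            sb.append(String.valueOf(value)); // numbers, booleans, and null
        }
    }

    private static void indent(int depth, StringBuilder sb) {
        for (int i = 0; i < depth; i++) {
            sb.append("  ");
        }
    }

    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

Passing a `LinkedHashMap` keeps members in insertion order. With a `HashMap`, as used by the JsonObject class above, the output order is unspecified.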
Conclusion
The article presents a simple JSON parser implementation for educational purposes, acknowledges its limitations, and invites readers to contribute improvements. Source code is available on GitHub.
Programmer DD
A tinkering programmer and author of "Spring Cloud Microservices in Action"
