Build a JCE Parser with PEG.js for Node.js BFF Framework
This tutorial explains how to use PEG.js to create a JavaScript parser that converts JCE protocol definitions into Node.js syntax, covering PEG.js basics, grammar rules, recursion, struct and interface parsing, and assembling a complete JCE parser.
When developing a frontend BFF framework, you may need to translate the team's JCE protocol definitions (JCE is similar to ProtoBuf) into Node.js syntax. This article demonstrates how to use PEG.js to parse JCE files and generate an AST.
Instead of relying on fragile regular expressions, PEG.js provides a more maintainable parser generator for JavaScript.
PEG.js is a JavaScript parser generator that can handle complex languages and is well suited to building converters, interpreters, and compilers. Its grammar is approachable for front-end engineers and requires little beyond basic regex knowledge.
A PEG.js grammar consists of a set of rules. By default the first rule is the start rule, which acts as the root; parsing begins there and reaches the other rules through references.
Each rule looks like a variable declaration: a name followed by a parsing expression built from string literals, regex-like character classes, and references to other rules.
<code>// additive.pegjs
start = additive
additive = left:multiplicative "+" right:additive { return left + right; }
/ multiplicative
multiplicative = left:primary "*" right:multiplicative { return left * right; }
/ primary
primary = integer
/ "(" additive:additive ")" { return additive; }
integer "integer" = digits:[0-9]+ { return parseInt(digits.join(""), 10); }
</code>The above grammar defines mixed addition and multiplication, so an expression such as (2+7)*8 is parsed and evaluated to 72.
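To make the rule structure more concrete, here is a hand-written recursive-descent sketch in plain JavaScript that mirrors each rule of the grammar as one function. It is not the code PEG.js generates, and the function names are only illustrative. It factors out the shared prefix of each choice instead of backtracking, which should behave the same way for this particular grammar.

```javascript
// Hand-written sketch mirroring additive.pegjs; NOT the code PEG.js generates.
function evaluate(input) {
  let pos = 0; // current read position in the input string

  // integer = [0-9]+
  function integer() {
    const start = pos;
    while (pos < input.length && input[pos] >= "0" && input[pos] <= "9") pos++;
    if (pos === start) throw new SyntaxError(`Expected integer at position ${pos}`);
    return parseInt(input.slice(start, pos), 10);
  }

  // primary = integer / "(" additive ")"
  function primary() {
    if (input[pos] === "(") {
      pos++;
      const value = additive();
      if (input[pos] !== ")") throw new SyntaxError(`Expected ")" at position ${pos}`);
      pos++;
      return value;
    }
    return integer();
  }

  // multiplicative = primary "*" multiplicative / primary
  function multiplicative() {
    const left = primary();
    if (input[pos] === "*") {
      pos++;
      return left * multiplicative();
    }
    return left;
  }

  // additive = multiplicative "+" additive / multiplicative
  function additive() {
    const left = multiplicative();
    if (input[pos] === "+") {
      pos++;
      return left + additive();
    }
    return left;
  }

  const result = additive();
  // Like the start rule, the whole input must be consumed.
  if (pos !== input.length) throw new SyntaxError(`Unexpected input at position ${pos}`);
  return result;
}

console.log(evaluate("(2+7)*8")); // 72
console.log(evaluate("2+7*8"));   // 58
```

Because `multiplicative` is only reachable through `additive`, multiplication binds tighter than addition without any explicit precedence table. The grammar gets the same effect from the way its rules reference each other.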
<code>integer = [0-9] // matches a single digit
// "1" -> "1"
// "12" -> error: expected end of input
integer = [0-9]+ // matches one or more digits
// "12" -> ["1","2"]
// "" -> error: expected [0-9]
integer = [0-9]* // matches zero or more digits
// "124" -> ["1","2","4"]
// "" -> []
</code>In PEG.js, the + operator means "match one or more", while * means "match zero or more". A repetition returns an array of the matched items, and you can customize a rule's return value with a JavaScript action block.
<code>integer = digits:[0-9]+ { return digits.join(""); } // join("") avoids the default "," separator
// "124" -> "124"
</code>Recursion is essential for describing nested structures. For example:
<code>commaSeparatedIntegerList = integer "," commaSeparatedIntegerList
/ integer
integer = [0-9]
</code>Parsing "1,2" yields the array ["1", ",", "2"]. Literal and character-class matches produce JavaScript strings, while sequences and repetitions produce arrays.
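With the rule as written, longer inputs come back nested: "1,2,3" yields ["1", ",", ["2", ",", "3"]], because each recursive step returns its own three-element sequence. A common remedy is to label the parts and return `[head, ...tail]` from the action. The sketch below shows that idea as a hand-written JavaScript function (the name `parseIntegerList` is hypothetical, and this is not PEG.js output); the equivalent labeled grammar is given in the leading comment.

```javascript
// Equivalent labeled grammar (illustrative):
//   commaSeparatedIntegerList
//     = head:integer "," tail:commaSeparatedIntegerList { return [head, ...tail]; }
//     / head:integer { return [head]; }
//   integer = [0-9]
function parseIntegerList(input, pos = 0) {
  const head = input[pos];
  // integer = [0-9]: exactly one digit is required here.
  if (head === undefined || head < "0" || head > "9") {
    throw new SyntaxError(`Expected [0-9] at position ${pos}`);
  }
  if (input[pos + 1] === ",") {
    // First alternative: recurse on the rest and flatten with spread.
    return [head, ...parseIntegerList(input, pos + 2)];
  }
  if (pos + 1 !== input.length) {
    throw new SyntaxError(`Expected "," or end of input at position ${pos + 1}`);
  }
  // Second alternative: a single integer ends the recursion.
  return [head];
}

console.log(parseIntegerList("1,2,3")); // [ '1', '2', '3' ]
```

The ParameterDefinition rule later in this article uses the same pattern, returning `[first, ...left]` so that a parameter list becomes a flat array.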
Next, we define rules for the struct and interface sections of a JCE file, using the following example:
<code>module MTT {
struct HelloReq {
0 require int id;
};
struct HelloRsp {
0 require int iCode;
1 require string sMessage;
};
interface Hello {
int hello (HelloReq req, out HelloRsp rsp);
};
};
</code>The struct rule parses a list of members, each consisting of an index, a keyword (require/optional), a type, an identifier, and a semicolon.
<code>// Trailing _* after ";" lets whitespace separate consecutive members and definitions.
StructDefinition = "struct" _+ id:Identifier _* "{" _* members:MemberDeclaration+ _* "}" _* ";" _* { return {id, type:"struct", members}; }
// The keyword in a JCE file is "require", so compare against that exact string.
MemberDeclaration = i:IntegerLiteral _+ key:("require"/"optional") _+ type:TypeSpecifier _+ id:Identifier _* ";" _* { return {index:i, isRequired:key === "require", id, type}; }
IntegerLiteral = "0" { return 0; }
/ head:[1-9] tail:[0-9]* { return parseInt([head, ...tail].join(""), 10); }
Identifier = head:[_a-zA-Z] tail:[_a-zA-Z0-9]* { return [head, ...tail].join(""); }
// A matched string literal has no join(), so return the normalized type name directly.
TypeSpecifier = "void" / "bool" / "string" / "int" / "short" / "unsigned" _+ "int" { return "unsigned int"; }
_ = ([ \t\n\r]) { return ""; }
</code>The interface rule parses method declarations, each with a return type, a name, and a parameter list whose parameters may be marked out.
<code>// Trailing _* after ";" lets whitespace separate consecutive methods and definitions.
InterfaceDefinition = "interface" _+ id:Identifier _* "{" _* methods:MethodDeclaration+ _* "}" _* ";" _* { return {id, type:"interface", methods}; }
MethodDeclaration = returnType:TypeSpecifier _+ id:Identifier _* "(" _* params:ParameterDefinition _* ")" _* ";" _* { return {id, type:"method", returnType, params}; }
ParameterDefinition = first:SingleParameterDefinition _* "," _* left:ParameterDefinition { return [first, ...left]; }
/ param:SingleParameterDefinition { return [param]; }
SingleParameterDefinition = "out" _+ type:(Identifier/TypeSpecifier) _+ id:Identifier { return {id, io:"out", type}; }
/ _* type:(Identifier/TypeSpecifier) _+ id:Identifier { return {id, io:"", type}; }
</code>Combining the struct and interface rules yields a complete PEG.js grammar for JCE files:
<code>jce = module:ModuleDefinition { return module; }
ModuleDefinition = _* "module" _+ id:Identifier _* "{" _* value:ValueDefinition+ _* "}" _* ";" _* { return {type:"module", id, value}; }
ValueDefinition = StructDefinition / InterfaceDefinition
</code>The final parsing result of the example JCE file is shown in the accompanying diagrams.
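Once the AST exists, turning it into Node.js code is a plain tree walk. The sketch below is only an illustration of that last step, not the framework's actual generator. The AST literal is written by hand to follow the object shapes returned by the grammar actions for the sample file (assuming `isRequired` is `true` for `require` members), and the names `DEFAULTS`, `generateStruct`, `generateInterface`, `generateModule`, and the `Proxy` suffix are hypothetical.

```javascript
// Hand-written AST for the sample JCE file, following the shapes the grammar
// actions return. Assumption: isRequired is true for "require" members.
const ast = {
  type: "module",
  id: "MTT",
  value: [
    { id: "HelloReq", type: "struct", members: [
      { index: 0, isRequired: true, id: "id", type: "int" },
    ] },
    { id: "HelloRsp", type: "struct", members: [
      { index: 0, isRequired: true, id: "iCode", type: "int" },
      { index: 1, isRequired: true, id: "sMessage", type: "string" },
    ] },
    { id: "Hello", type: "interface", methods: [
      { id: "hello", type: "method", returnType: "int", params: [
        { id: "req", io: "", type: "HelloReq" },
        { id: "rsp", io: "out", type: "HelloRsp" },
      ] },
    ] },
  ],
};

// Illustrative default values per JCE type (an assumption, not a JCE rule).
const DEFAULTS = { int: "0", short: "0", "unsigned int": "0", bool: "false", string: '""' };

// Emit a class whose constructor initializes every struct member.
function generateStruct(node) {
  const fields = node.members
    .map((m) => `    this.${m.id} = ${DEFAULTS[m.type] ?? "null"};`)
    .join("\n");
  return `class ${node.id} {\n  constructor() {\n${fields}\n  }\n}`;
}

// Emit a stub class; "out" parameters are dropped from the call signature.
function generateInterface(node) {
  const methods = node.methods.map((method) => {
    const inputs = method.params.filter((p) => p.io !== "out").map((p) => p.id);
    return `  async ${method.id}(${inputs.join(", ")}) {\n    throw new Error("not implemented");\n  }`;
  });
  return `class ${node.id}Proxy {\n${methods.join("\n")}\n}`;
}

// Walk the module and dispatch on each definition's type.
function generateModule(module) {
  return module.value
    .map((node) => (node.type === "struct" ? generateStruct(node) : generateInterface(node)))
    .join("\n\n");
}

console.log(generateModule(ast));
```

Keeping the generator separate from the grammar means the parser only has to produce a stable AST, and the output format can change without touching the parsing rules.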
References: Intro to Peg.js; PEG.js documentation, section "Grammar Syntax and Semantics".
QQ Music Frontend Team
QQ Music Web Frontend Team