Understanding ECMAScript Grammars: Lexical and Syntactic Rules and the Disallowance of await as an Identifier

This article explains how the ECMAScript specification defines four context‑free grammars—lexical, syntactic, RegExp, and numeric string—illustrates ambiguities such as the '/' token and template literals, and shows how static semantics forbid using the await keyword as an identifier inside async functions while allowing it elsewhere.

ByteFE

Apr 22, 2021

ECMAScript Grammars

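The ECMAScript specification defines four grammars: the lexical grammar, which describes how Unicode code points are translated into a sequence of input elements (tokens, line terminators, comments, and white space); the syntactic grammar, which defines how tokens form syntactically correct programs; the RegExp grammar, which describes how code points are translated into regular expressions; and the numeric string grammar, which describes how strings are translated into numeric values. Each one is a context-free grammar, written as a set of productions.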
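One way to see that these really are separate grammars is to compare a numeric literal in source text (lexical grammar) with a string converted at run time (numeric string grammar). The following is a minimal sketch, assuming an engine that supports numeric separators (ES2021); the variable names are illustrative only.

```javascript
// Numeric literal in source text: tokenised by the lexical grammar,
// which accepts numeric separators.
const fromLiteral = 1_000;

// Strings converted at run time: parsed by the numeric string grammar,
// which accepts surrounding white space and the empty string,
// but not numeric separators.
const fromPaddedString = Number("  42  ");
const fromSeparatorString = Number("1_000");
const fromEmptyString = Number("");

console.log(fromLiteral, fromPaddedString, fromSeparatorString, fromEmptyString);
```

The same characters, 1_000, are accepted in one grammar and rejected in the other, which is why the spec needs both.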

Lexical Grammar

The source text is a sequence of Unicode code points, and the lexical grammar tokenises this sequence. Some characters cannot be tokenised without knowing the surrounding syntactic context. For example, the character / can be a division operator (DivPunctuator) or the start of a regular-expression literal (RegularExpressionLiteral):

const x = 10 / 5;

Here / is a DivPunctuator. In contrast:

const r = /foo/;

Here the first / begins a RegularExpressionLiteral.

Template literals introduce a similar ambiguity. Depending on where it appears, the sequence }` is either a TemplateTail that closes a substitution, or a RightBracePunctuator followed by the start of a new template.
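Both ambiguities can be observed directly in an engine. The sketch below is illustrative only: the variable names are made up for the example, and the same characters are deliberately reused in different positions so that they tokenise differently.

```javascript
const b = 8, hi = 2, g = 2;

// After an operand, "/" is a DivPunctuator: this is (b / hi) / g.
const quotient = b /hi/ g;

// Where an expression is expected, "/" starts a RegularExpressionLiteral.
const pattern = /hi/g;

const what1 = "temp", what2 = "late";

// Inside a template substitution, }` is a TemplateTail.
const t = `I am a ${ what1 + what2 }`;

// After a block, } is a RightBracePunctuator and ` starts a new template.
if (0 == 1) { }`not very useful`;

console.log(quotient, pattern, t);
```

In each pair the tokeniser sees the same code points; only the position in the program differs.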

To resolve these cases the lexical grammar has several goal symbols, including InputElementDiv and InputElementRegExp, and the syntactic context determines which one is used at each position. InputElementDiv permits DivPunctuator but not RegularExpressionLiteral, whereas InputElementRegExp permits RegularExpressionLiteral but not DivPunctuator.
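The idea of a goal symbol selecting the permitted tokens can be sketched with a toy scanner. This is not the spec's algorithm or any engine's implementation: scanSlash is a hypothetical helper, and its regular-expression matching is heavily simplified (it ignores character classes, for instance).

```javascript
// Toy illustration: the caller supplies the goal symbol, and the goal
// decides how a "/" at position pos is tokenised.
function scanSlash(source, pos, goal) {
  if (source[pos] !== "/") {
    throw new Error("expected '/' at position " + pos);
  }
  if (goal === "InputElementDiv") {
    // Only DivPunctuator is permitted here: "/" or "/=".
    const text = source[pos + 1] === "=" ? "/=" : "/";
    return { type: "DivPunctuator", text };
  }
  if (goal === "InputElementRegExp") {
    // Only RegularExpressionLiteral is permitted here: /body/flags.
    const match = /^\/(?:\\.|[^\\\/\n])+\/[A-Za-z]*/.exec(source.slice(pos));
    if (!match) {
      throw new SyntaxError("unterminated regular expression literal");
    }
    return { type: "RegularExpressionLiteral", text: match[0] };
  }
  throw new Error("unknown goal symbol: " + goal);
}

// The "/" after an operand is scanned with InputElementDiv.
const divToken = scanSlash("10 / 5", 3, "InputElementDiv");

// The "/" where an expression may start is scanned with InputElementRegExp.
const regexToken = scanSlash("r = /foo/g;", 4, "InputElementRegExp");

console.log(divToken, regexToken);
```

A real parser knows which goal to request because it tracks what the syntactic grammar allows next; here that decision is simply passed in as an argument.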

Syntactic Grammar

The syntactic grammar builds on the lexical grammar and defines how tokens combine into syntactically correct programs. It also has to preserve backwards compatibility: introducing a new keyword such as await must not break existing code that used the word as an identifier.

function old() { var await; }

Inside an async function, await is a keyword, so the same declaration becomes a syntax error:

async function modern() { var await; } // Syntax error

To handle this, the spec uses parameterised productions (for example, VariableStatement[Yield, Await]) together with static semantics. The [Await] parameter is passed down from an async function body to the productions nested inside it. The static semantics rule for BindingIdentifier then states that it is a Syntax Error if the production has an [Await] parameter and the string value of the identifier is "await", which prevents await from being declared as an identifier inside async functions.
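The effect of the [Await] parameter can be checked by asking the engine to parse small programs without running them. The sketch below assumes an environment where the Function constructor is available (for example Node.js, or a page without a restrictive Content Security Policy); parses is a hypothetical helper written for this example.

```javascript
// Returns true if the source parses as a sloppy-mode function body.
// The body is only parsed, never called.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected
  }
}

// No [Await] parameter: await is an ordinary identifier.
const inSyncFunction = parses("function old() { var await; }");

// [Await] is present: the early error applies.
const inAsyncFunction = parses("async function modern() { var await; }");

// A non-async function nested in an async one does not inherit [Await]
// for its body, so the declaration is allowed again.
const inNestedSyncFunction = parses(
  "async function outer() { function inner() { var await; } }"
);

console.log(inSyncFunction, inAsyncFunction, inNestedSyncFunction);
```

The nested case shows why the rule is expressed through grammar parameters rather than as a blanket ban on the word await.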

Static Semantics and Identifier Names

Static semantics are rules that are checked before execution; violations are reported as early errors. They also cover identifiers whose string value is formed via Unicode escapes. For example, \u0061wait has the string value "await" but is not recognised as a keyword by the lexical grammar. Because the static semantics rule compares the string value, it still forbids this identifier inside async functions.

function old() { var \u0061wait; } // allowed

async function modern() { var \u0061wait; } // Syntax error
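The escaped form can be checked the same way, by parsing source strings without executing them. As before, this is a sketch that assumes the Function constructor is available, and parses is a hypothetical helper. Note the doubled backslash: the JavaScript string must contain the literal characters \u0061 so that the parsed program contains the escape sequence.

```javascript
// Returns true if the source parses as a sloppy-mode function body.
function parses(source) {
  try {
    new Function(source); // parsed only, never called
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// The escaped identifier has the string value "await".
const escapedInSync = parses("function old() { var \\u0061wait; }");
const escapedInAsync = parses("async function modern() { var \\u0061wait; }");

console.log(escapedInSync, escapedInAsync);
```

The escaped spelling is treated exactly like the plain one in both contexts, because the early error is defined on the identifier's string value rather than on how it was written.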

Summary

The article familiarises the reader with ECMAScript’s lexical and syntactic grammars, demonstrates context‑sensitive tokenisation (e.g., the '/' and template literal cases), and explains how static semantics enforce that await cannot be used as an identifier inside async functions while remaining valid elsewhere.
