Understanding Abstract Syntax Trees (AST) and Their Applications in JavaScript Tooling
This article explains what an Abstract Syntax Tree (AST) is, how JavaScript code is parsed into ASTs, the processes of lexical and syntactic analysis, and demonstrates practical AST manipulation using tools like Esprima, Estraverse, Escodegen, and Babel to transform code such as renaming functions, converting arrow functions, and implementing on‑demand imports.
What is AST
An Abstract Syntax Tree (AST) is a tree‑structured representation of the abstract syntactic structure of source code. Many tools and libraries, such as webpack and ESLint, rely on ASTs to perform code checking, analysis, and other operations. This article introduces the concept of the AST for interpreted languages such as JavaScript.
Browsers typically convert JavaScript code into an AST before performing further analysis and execution, so turning JavaScript into an AST facilitates program analysis.
The picture above shows a variable declaration statement, var AST = "is tree"; , on the left; the right‑hand diagram shows the AST it is converted to.
In the left diagram:
var is a keyword
AST is the name being declared (an identifier)
= is the equals sign (initializers take many forms, which will be seen later)
"is tree" is a string
; is a semicolon
When a piece of code is converted to an AST, the top‑level object has a type property with value Program ; the second property is body , which is an array.
The body array contains objects, each describing a statement.
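As a concrete reference, the tree for var AST = "is tree"; can be written out as a plain object. This is a hand‑written sketch that follows the ESTree shape Esprima produces; position information is left out for brevity.

```javascript
// Hand-written ESTree-style AST for: var AST = "is tree";
const ast = {
  type: 'Program',
  sourceType: 'script',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'AST' },
          init: { type: 'Literal', value: 'is tree', raw: '"is tree"' },
        },
      ],
    },
  ],
};

// Walk down to the single declarator and print its name and raw initializer.
const declarator = ast.body[0].declarations[0];
console.log(declarator.id.name, '=', declarator.init.raw); // AST = "is tree"
```

The fields of each nested object are described one by one below.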
type: // describes the statement type – variable declaration
kind: // keyword of the declaration – var
declarations: // array of declaration contents, each also an object
type: // describes the statement type
id: // object describing the variable name
type: // identifier
name: // variable name
init: // object describing the initialization value
type: // type
value: "is tree" // without quotes
raw: "\"is tree\"" // with quotes
Lexical Analysis and Syntax Analysis
JavaScript is an interpreted language; generally the process is lexical analysis → syntax analysis → AST, after which execution can begin.
Lexical analysis (also called scanning ) converts a character stream into a token stream ( tokens ). It reads the code and, according to fixed rules, groups characters into tokens such as keywords, identifiers, punctuators, and literals.
For example, the code var a = 2 is usually broken down into the tokens var , a , = , 2 .
[
{ type: 'Keyword', value: 'var' },
{ type: 'Identifier', value: 'a' },
{ type: 'Punctuator', value: '=' },
{ type: 'Numeric', value: '2' },
]
During lexical analysis the source code is read character by character, which is why the process is often called scanning. When the scanner encounters whitespace, an operator, or a special symbol, it treats the characters read so far as a complete token.
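The scanning step can be sketched with a small hand‑rolled tokenizer. This is not Esprima's implementation; the keyword and punctuator sets are deliberately tiny, and only integer literals are recognized.

```javascript
// Minimal hand-rolled tokenizer: a sketch, not Esprima's implementation.
const KEYWORDS = new Set(['var', 'let', 'const', 'function', 'return']);
const PUNCTUATORS = new Set(['=', '+', '-', '*', '/', ';', '(', ')', '{', '}', ',']);

function tokenize(source) {
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (/\s/.test(ch)) { i++; continue; } // whitespace only separates tokens
    if (/[A-Za-z_$]/.test(ch)) { // identifier or keyword
      let j = i + 1;
      while (j < source.length && /[A-Za-z0-9_$]/.test(source[j])) j++;
      const value = source.slice(i, j);
      tokens.push({ type: KEYWORDS.has(value) ? 'Keyword' : 'Identifier', value });
      i = j;
      continue;
    }
    if (/[0-9]/.test(ch)) { // integer literal only
      let j = i + 1;
      while (j < source.length && /[0-9]/.test(source[j])) j++;
      tokens.push({ type: 'Numeric', value: source.slice(i, j) });
      i = j;
      continue;
    }
    if (PUNCTUATORS.has(ch)) { // single-character punctuator
      tokens.push({ type: 'Punctuator', value: ch });
      i++;
      continue;
    }
    throw new SyntaxError(`Unexpected character "${ch}" at index ${i}`);
  }
  return tokens;
}

console.log(tokenize('var a = 2'));
```

For the input var a = 2 this yields the same four tokens as listed above.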
Syntax analysis (also called parsing ) converts the token array into a tree structure and validates the syntax. If there is a syntax error, an exception is thrown.
{
...
"type": "VariableDeclarator",
"id": {
"type": "Identifier",
"name": "a"
},
...
}
You can view the AST generated by syntax analysis online at http://esprima.org .
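To make the token‑to‑tree step concrete, here is a sketch of a parser that understands exactly one statement form, var <identifier> = <number>. The function name parseVarDeclaration is illustrative; real parsers such as Esprima cover the full grammar.

```javascript
// Sketch of a parser for one statement form only: var <identifier> = <number>
function parseVarDeclaration(tokens) {
  let pos = 0;

  // Consume the next token, or throw if it is not what the grammar expects.
  function expect(type, value) {
    const token = tokens[pos];
    if (!token || token.type !== type || (value !== undefined && token.value !== value)) {
      const found = token ? token.value : 'end of input';
      throw new SyntaxError(`Expected ${value || type} but found ${found}`);
    }
    pos++;
    return token;
  }

  expect('Keyword', 'var');
  const id = expect('Identifier');
  expect('Punctuator', '=');
  const num = expect('Numeric');
  if (pos < tokens.length) {
    throw new SyntaxError(`Unexpected token ${tokens[pos].value}`);
  }

  return {
    type: 'Program',
    body: [{
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [{
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name: id.value },
        init: { type: 'Literal', value: Number(num.value), raw: num.value },
      }],
    }],
  };
}

const tokens = [
  { type: 'Keyword', value: 'var' },
  { type: 'Identifier', value: 'a' },
  { type: 'Punctuator', value: '=' },
  { type: 'Numeric', value: '2' },
];
console.log(JSON.stringify(parseVarDeclaration(tokens), null, 2));
```

Passing an incomplete token list, for example only var and a , makes the sketch throw a SyntaxError, which mirrors the validation role of syntax analysis.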
What Can AST Do
Syntax checking, code style checking, code formatting, syntax highlighting, error hints, auto‑completion, etc.
Code obfuscation and compression
Optimizing and restructuring code
For example, given a function function a() {} , you may want to change it to function b() {} .
In webpack , after code compilation require('a') becomes __webpack_require__("*/**/a.js") .
Below is a set of tools that can convert code to an AST, modify nodes, and generate new code.
AST Parsing Process
Prepare the tools:
esprima – code → AST
estraverse – traverse/transform the AST
escodegen – AST → code
An online AST conversion site is recommended: https://astexplorer.net/ .
For example, the function in function getUser() {} can be renamed to hello . First, parse the code and traverse the AST:
const esprima = require('esprima')
const estraverse = require('estraverse')
const code = `function getUser() {}`
// generate AST
const ast = esprima.parseScript(code)
// traverse AST – enter/leave are called for each node; here we log its "type"
estraverse.traverse(ast, {
enter(node) { console.log('enter -> node.type', node.type) },
leave(node) { console.log('leave -> node.type', node.type) },
})
The output result is shown below:
From this we can see that the AST traversal process is depth‑first, as illustrated:
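The visiting order can be reproduced without any library. The walk function below is a hand‑rolled sketch of depth‑first traversal with enter and leave hooks, not estraverse itself, and the AST is a simplified hand‑written tree for function getUser() {} .

```javascript
// Hand-rolled depth-first walk with enter/leave hooks (a sketch, not estraverse).
function walk(node, visitor) {
  if (visitor.enter) visitor.enter(node);
  for (const key of Object.keys(node)) {
    const value = node[key];
    const children = Array.isArray(value) ? value : [value];
    for (const child of children) {
      // Only objects carrying a string `type` are treated as AST nodes.
      if (child && typeof child === 'object' && typeof child.type === 'string') {
        walk(child, visitor);
      }
    }
  }
  if (visitor.leave) visitor.leave(node);
}

// Simplified AST for: function getUser() {}
const ast = {
  type: 'Program',
  body: [{
    type: 'FunctionDeclaration',
    id: { type: 'Identifier', name: 'getUser' },
    params: [],
    body: { type: 'BlockStatement', body: [] },
  }],
};

const order = [];
walk(ast, {
  enter(node) { order.push(`enter ${node.type}`); },
  leave(node) { order.push(`leave ${node.type}`); },
});
console.log(order.join('\n'));
// enter Program
// enter FunctionDeclaration
// enter Identifier
// leave Identifier
// enter BlockStatement
// leave BlockStatement
// leave FunctionDeclaration
// leave Program
```

Each node is left only after all of its children have been entered and left, which is the defining property of a depth‑first walk.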
Modify Function Name
When we inspect the AST we find that the function name is stored in a node whose type is Identifier . Modifying that node directly changes the function name.
const escodegen = require('escodegen')
// transform tree
estraverse.traverse(ast, {
// both enter and leave can modify
enter(node) {
console.log('enter -> node.type', node.type)
if (node.type === 'Identifier') {
node.name = 'hello'
}
},
leave(node) { console.log('leave -> node.type', node.type) },
})
// generate new code
const result = escodegen.generate(ast)
console.log(result) // function hello() {}
Babel Working Principle
Whenever ASTs are mentioned, Babel comes to mind. As ES6 became widely used, Babel emerged to work around browsers’ incomplete support for ES6 features by converting ES6 code to ES5, which older browsers can run. Babel’s code transformation is built on the AST, so the two are closely tied.
In Babel, the compilation process uses the configuration files .babelrc or babel.config.js , which contain presets and plugins (among other options).
Difference Between Plugins and Presets
// .babelrc
{
"presets": ["@babel/preset-env"],
"plugins": []
}
When presets includes @babel/preset-env , @babel/core resolves that preset package, which is itself a collection of plugins.
The core Babel package does not perform code transformation itself; it only provides core APIs. Real transformation work is done by plugins or presets. For example, to transform arrow functions, the plugin @babel/plugin-transform-arrow-functions is used. When many transformations are needed, a preset (a set of plugins) is more convenient.
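Conceptually, a preset is a function that returns a configuration object listing plugins. In the sketch below the preset name myEs5Preset is hypothetical and the choice of the two plugins is arbitrary; nothing is resolved or executed here, the code only shows the shape.

```javascript
// Hypothetical preset: bundles several plugins under one name.
// Babel calls a preset function and merges the returned `plugins` into its config.
function myEs5Preset() {
  return {
    plugins: [
      '@babel/plugin-transform-arrow-functions',
      '@babel/plugin-transform-block-scoping',
    ],
  };
}

// Listing the preset is then roughly equivalent to listing its plugins one by one.
const withPreset = { presets: [myEs5Preset] };
const withPlugins = { plugins: myEs5Preset().plugins };
console.log(withPreset, withPlugins);
```

The two configurations are only roughly equivalent because Babel applies its own ordering rules to plugins and presets.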
Babel Plugin Usage
To transform an arrow function into a normal function:
const babel = require('@babel/core')
const code = `const fn = (a, b) => a + b`
// babel.transform will automatically traverse and apply the appropriate preset/plugin
const r = babel.transform(code, {
presets: ['@babel/preset-env'],
})
console.log(r.code)
// "use strict";
// var fn = function fn(a, b) { return a + b; };
If only the arrow‑function‑to‑normal‑function feature is needed, you can use the specific plugin directly:
const r = babel.transform(code, {
plugins: ['@babel/plugin-transform-arrow-functions'],
})
console.log(r.code)
// const fn = function (a, b) { return a + b; };
The result shows that the variable declaration keyword remains const ; only the arrow function is transformed.
Write Your Own Plugin
Now we can write our own plugins to perform custom code transformations, using the AST manipulation logic described earlier.
For example, convert const fn = (a, b) => a + b into const fn = function(a, b) { return a + b } .
Analyze AST Structure
First, inspect the AST of the arrow function and the normal function on the online AST explorer to see the differences.
From the analysis we can conclude:
After conversion the function is no longer an ArrowFunctionExpression but a FunctionExpression .
We need to replace ArrowFunctionExpression nodes with FunctionExpression nodes.
The binary expression inside the arrow function must be placed inside a BlockStatement .
Essentially we are rebuilding a new tree and generating code from it.
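The rebuild can be sketched on plain objects before bringing in any Babel helper. The function arrowToFunctionExpression below is an illustrative name, and the nodes are simplified (no location data).

```javascript
// Plain-object sketch of the node rewrite; a real plugin would use node builders.
function arrowToFunctionExpression(node) {
  if (node.type !== 'ArrowFunctionExpression') return node;
  // An expression body such as `a + b` must be wrapped as `{ return a + b }`.
  const body = node.body.type === 'BlockStatement'
    ? node.body
    : {
        type: 'BlockStatement',
        body: [{ type: 'ReturnStatement', argument: node.body }],
      };
  return {
    type: 'FunctionExpression',
    id: null,
    params: node.params,
    body,
    generator: false,
    async: node.async === true,
  };
}

// Simplified node for: (a, b) => a + b
const arrow = {
  type: 'ArrowFunctionExpression',
  params: [
    { type: 'Identifier', name: 'a' },
    { type: 'Identifier', name: 'b' },
  ],
  body: {
    type: 'BinaryExpression',
    operator: '+',
    left: { type: 'Identifier', name: 'a' },
    right: { type: 'Identifier', name: 'b' },
  },
};

const fn = arrowToFunctionExpression(arrow);
console.log(fn.type, fn.body.body[0].type); // FunctionExpression ReturnStatement
```

Note that arrow functions capture this lexically while normal functions do not; this sketch ignores that difference.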
Visitor Pattern
When developing Babel plugins, the visitor pattern is used: when a certain node path is visited, you can match it and modify the node. For example, when encountering an ArrowFunctionExpression , we replace it with a normal function.
const babel = require('@babel/core')
const code = `const fn = (a, b) => a + b`
const arrowFnPlugin = {
visitor: {
ArrowFunctionExpression(path) {
const node = path.node
console.log('ArrowFunctionExpression -> node', node)
// further processing will replace the node
},
},
}
const r = babel.transform(code, { plugins: [arrowFnPlugin] })
console.log(r)
Modify AST Structure
We obtain an ArrowFunctionExpression node and need to replace it with a FunctionExpression . Babel provides @babel/types to help create new nodes.
@babel/types has two main purposes:
Check whether a node is of a certain type (e.g., t.isArrowFunctionExpression(node) ).
Generate corresponding node structures.
To create a FunctionExpression we use:
t.functionExpression(id, params, body, generator, async)
id : Identifier (default null )
params : Array<LVal> – function parameters
body : BlockStatement – the function body
generator : boolean (default false )
async : boolean (default false )
We also need a BlockStatement :
t.blockStatement(body, directives)
The body of a BlockStatement is an array of statements; in our case it contains a ReturnStatement .
Finally we replace the original node:
const babel = require('@babel/core')
const t = require('@babel/types')
const code = `const fn = (a, b) => a + b`
const arrowFnPlugin = {
visitor: {
ArrowFunctionExpression(path) {
const node = path.node
const params = node.params
const body = node.body
const functionExpression = t.functionExpression(null, params, t.blockStatement([t.returnStatement(body)]))
path.replaceWith(functionExpression)
},
},
}
const r = babel.transform(code, { plugins: [arrowFnPlugin] })
console.log(r.code) // const fn = function(a, b) { return a + b; };
Special Cases
Arrow functions can omit the return keyword. The previous plugin works for that case, but if the user writes an explicit return , the plugin needs to handle it as well.
const fn = (a, b) => { return a + b }
// should become
const fn = function(a, b) { return a + b }
We adjust the plugin to ensure the body is always a BlockStatement :
ArrowFunctionExpression(path) {
const node = path.node
const params = node.params
let body = node.body
if (!t.isBlockStatement(body)) {
body = t.blockStatement([t.returnStatement(body)])
}
const functionExpression = t.functionExpression(null, params, body)
path.replaceWith(functionExpression)
}
On‑Demand Import
UI libraries for Vue and React, such as element-ui , vant , and antd , support both global import and on‑demand import. By default the whole library is imported; to switch to on‑demand import you need the babel-plugin-import plugin, which rewrites the import statements.
For example, the Vant import import { Button } from 'vant' becomes import Button from 'vant/lib/Button' , reducing bundle size.
Analyze Import Syntax Tree
import { Button, Icon } from 'vant' is transformed into import Button from 'vant/lib/Button'; import Icon from 'vant/lib/Icon'
The two ASTs differ as follows:
The destructuring import has a single ImportDeclaration with multiple ImportSpecifier s; the transformed version has multiple ImportDeclaration s each with an ImportDefaultSpecifier .
The source field differs: the original points to the package, the transformed points to the specific file.
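The difference between the two shapes can be sketched on plain objects. The helper name splitImport and its libraryDir parameter are illustrative; the nodes are simplified and carry no location data.

```javascript
// Plain-object sketch: split one import declaration into per-component default imports.
function splitImport(node, libraryDir) {
  return node.specifiers.map((specifier) => {
    const isNamed = specifier.type === 'ImportSpecifier';
    return {
      type: 'ImportDeclaration',
      specifiers: [{ type: 'ImportDefaultSpecifier', local: specifier.local }],
      source: {
        type: 'StringLiteral',
        // Named imports point at the component file; a default import keeps the package path.
        value: isNamed
          ? `${node.source.value}/${libraryDir}/${specifier.imported.name}`
          : node.source.value,
      },
    };
  });
}

// Simplified node for: import { Button, Icon } from 'vant'
const original = {
  type: 'ImportDeclaration',
  specifiers: [
    {
      type: 'ImportSpecifier',
      imported: { type: 'Identifier', name: 'Button' },
      local: { type: 'Identifier', name: 'Button' },
    },
    {
      type: 'ImportSpecifier',
      imported: { type: 'Identifier', name: 'Icon' },
      local: { type: 'Identifier', name: 'Icon' },
    },
  ],
  source: { type: 'StringLiteral', value: 'vant' },
};

const sources = splitImport(original, 'lib').map((decl) => decl.source.value);
console.log(sources); // [ 'vant/lib/Button', 'vant/lib/Icon' ]
```

The path is built from the imported name rather than the local name, so an aliased import such as { Button as Btn } would still resolve to the Button file in this sketch.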
Analyze Types
We need to generate multiple ImportDeclaration s. The API is:
t.importDeclaration(specifiers, source)
Each ImportDeclaration requires an ImportDefaultSpecifier and a StringLiteral for the source:
t.importDefaultSpecifier(local)
t.stringLiteral(value)
Write the Plugin
const babel = require('@babel/core')
const t = require('@babel/types')
const code = `import { Button, Icon } from 'vant'`
function importPlugin(opt) {
const { libraryDir } = opt
return {
visitor: {
ImportDeclaration(path) {
const node = path.node
const specifiers = node.specifiers
if (!(specifiers.length === 1 && t.isImportDefaultSpecifier(specifiers[0]))) {
const result = specifiers.map(specifier => {
const local = specifier.local
const source = t.stringLiteral(`${node.source.value}/${libraryDir}/${specifier.local.name}`)
return t.importDeclaration([t.importDefaultSpecifier(local)], source)
})
path.replaceWithMultiple(result)
}
},
},
}
}
const r = babel.transform(code, { plugins: [importPlugin({ libraryDir: 'lib' })] })
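console.log(r.code)
The transformation works as expected: the single declaration is replaced by import Button from "vant/lib/Button"; and import Icon from "vant/lib/Icon"; .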
Special Case
If the user writes a mixed import such as import vant, { Button, Icon } from 'vant' , the plugin must keep the default import unchanged and only transform the named specifiers.
function importPlugin(opt) {
const { libraryDir } = opt
return {
visitor: {
ImportDeclaration(path) {
const node = path.node
const specifiers = node.specifiers
if (!(specifiers.length === 1 && t.isImportDefaultSpecifier(specifiers[0]))) {
const result = specifiers.map(specifier => {
let local = specifier.local
let source
if (t.isImportDefaultSpecifier(specifier)) {
source = t.stringLiteral(node.source.value)
} else {
source = t.stringLiteral(`${node.source.value}/${libraryDir}/${specifier.local.name}`)
}
return t.importDeclaration([t.importDefaultSpecifier(local)], source)
})
path.replaceWithMultiple(result)
}
},
},
}
}
babylon
Babylon is the JavaScript parser used in Babel. Since Babel 7 it has been published as @babel/parser .
babylon and Babel Relationship
Babel’s parsing engine is babylon , which is a fork of the acorn project. acorn only provides basic parsing to an AST; traversal and node replacement require additional packages such as acorn-traverse . Babel integrates these capabilities into a unified plugin system.
Using babylon
We will write a plugin that converts array spread syntax to ES5 code:
Convert const arr = [ ...arr1, ...arr2 ] into var arr = [].concat(arr1, arr2) using babylon , @babel/traverse , @babel/generator , and @babel/types .
Analyze Syntax Tree
Observations:
Both trees are variable declarations, but the declaration keyword differs ( const vs var ).
The initializer differs: one is an ArrayExpression , the other a CallExpression .
Therefore we only need to replace the array expression with a call expression.
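Before rewriting the tree, it is worth checking that the two forms really produce the same value. The snippet below is a small sanity check with made‑up sample arrays; it also shows one case where the forms diverge, which is an assumption the transformation silently relies on.

```javascript
const arr1 = [1, 2];
const arr2 = [3, 4];

const viaSpread = [...arr1, ...arr2];
const viaConcat = [].concat(arr1, arr2);

console.log(viaSpread); // [ 1, 2, 3, 4 ]
console.log(viaConcat); // [ 1, 2, 3, 4 ]

// The forms diverge for non-array iterables:
// spread iterates the value, concat appends it as a single element.
const letters = new Set(['x', 'y']);
console.log([...letters].length); // 2
console.log([].concat(letters).length); // 1
```

So the rewrite is safe only when every spread operand is a real array.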
Analyze Types
We need to generate a CallExpression :
t.callExpression(callee, arguments)
The callee is a MemberExpression (e.g., [].concat ), which is created with:
t.memberExpression(object, property, computed, optional)
The object is an empty ArrayExpression :
t.arrayExpression(elements)
Finally we need to create a VariableDeclarator and a VariableDeclaration :
t.variableDeclarator(id, init)
t.variableDeclaration(kind, declarations)
Write the Transformation
const babylon = require('babylon')
const traverse = require('@babel/traverse').default
const generator = require('@babel/generator').default
const t = require('@babel/types')
const code = `const arr = [ ...arr1, ...arr2 ]`
const ast = babylon.parse(code, { sourceType: 'module' })
traverse(ast, {
VariableDeclaration(path) {
const node = path.node
const declarations = node.declarations
const kind = 'var'
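// only rewrite arrays made up entirely of spread elements, e.g. [ ...arr1, ...arr2 ]
if (node.kind !== kind && declarations.length === 1 && t.isArrayExpression(declarations[0].init) && declarations[0].init.elements.every(el => t.isSpreadElement(el))) {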
const args = declarations[0].init.elements.map(item => item.argument)
const callee = t.memberExpression(t.arrayExpression(), t.identifier('concat'), false)
const init = t.callExpression(callee, args)
const declaration = t.variableDeclarator(declarations[0].id, init)
const variableDeclaration = t.variableDeclaration(kind, [declaration])
path.replaceWith(variableDeclaration)
}
},
})
const output = generator(ast).code
console.log(output) // var arr = [].concat(arr1, arr2)
Concrete Syntax Tree
The counterpart of an AST is a Concrete Syntax Tree (CST), also known as a parse tree. In compilation, a parser creates a CST; later phases such as semantic analysis may augment the tree. See the differences between AST and CST for more details.
Supplement
Below is a non‑exhaustive list of node types used by Babel:
(parameter) node: Identifier | SimpleLiteral | RegExpLiteral | Program | FunctionDeclaration | FunctionExpression | ArrowFunctionExpression | SwitchCase | CatchClause | VariableDeclarator | ExpressionStatement | BlockStatement | EmptyStatement | DebuggerStatement | WithStatement | ReturnStatement | LabeledStatement | BreakStatement | ContinueStatement | IfStatement | SwitchStatement | ThrowStatement | TryStatement | WhileStatement | DoWhileStatement | ForStatement | ForInStatement | ForOfStatement | VariableDeclaration | ClassDeclaration | ThisExpression | ArrayExpression | ObjectExpression | YieldExpression | UnaryExpression | UpdateExpression | BinaryExpression | AssignmentExpression | LogicalExpression | MemberExpression | ConditionalExpression | SimpleCallExpression | NewExpression | SequenceExpression | TemplateLiteral | TaggedTemplateExpression | ClassExpression | MetaProperty | AwaitExpression | Property | AssignmentProperty | Super | TemplateElement | SpreadElement | ObjectPattern | ArrayPattern | RestElement | AssignmentPattern | ClassBody | MethodDefinition | ImportDeclaration | ExportNamedDeclaration | ExportDefaultDeclaration | ExportAllDeclaration | ImportSpecifier | ImportDefaultSpecifier | ImportNamespaceSpecifier | ExportSpecifier
Babel’s documentation provides a detailed definition of the AST tree.
Source Code Repository
The code is stored on GitHub: https://github.com/fecym/ast-share