
Understanding Abstract Syntax Trees (AST) and Their Applications in JavaScript Tooling

This article explains what an Abstract Syntax Tree (AST) is, how JavaScript code is parsed into ASTs, the processes of lexical and syntactic analysis, and demonstrates practical AST manipulation using tools like Esprima, Estraverse, Escodegen, and Babel to transform code such as renaming functions, converting arrow functions, and implementing on‑demand imports.

Sohu Tech Products

Jun 24, 2020

What is AST

An Abstract Syntax Tree (AST) is a tree-structured representation of the abstract syntactic structure of source code. Many tools and libraries, such as webpack and ESLint, rely on ASTs to check, analyze, and transform code. This article introduces the AST as it applies to an interpreted language like JavaScript.

JavaScript engines in browsers typically parse source code into an AST before analyzing and executing it, and tools can use the same representation to analyze programs.

Consider the variable declaration statement var AST = "is tree"; (the original figure showed this statement on the left and its AST on the right).

The statement breaks down as follows:

var is a keyword

AST is the declarator (the variable name)

= is the equals sign (the initializer can take many forms, as shown later)

"is tree" is a string

; is a semicolon

When a piece of code is converted to an AST, the top‑level object has a type property with value Program; the second property is body, which is an array.

The body array contains objects, each describing a statement.

type:          // describes the statement type – variable declaration
kind:          // keyword of the declaration – var
declarations:  // array of declaration contents, each also an object
  type:        // describes the statement type
  id:          // object describing the variable name
    type:      // identifier
    name:      // variable name
  init:        // object describing the initialization value
    type:      // type
    value: "is tree" // without quotes
    raw: "\"is tree\"" // with quotes
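To make the outline above concrete, here is roughly what the whole tree looks like as a plain JavaScript object. This is a hand-written sketch, not captured parser output. It assumes the statement is var AST = "is tree"; and follows the ESTree node shapes that Esprima produces, with position fields such as range and loc left out.

```javascript
// Hand-written sketch of the AST for: var AST = "is tree";
const ast = {
  type: 'Program',
  sourceType: 'script',
  body: [
    {
      type: 'VariableDeclaration',
      kind: 'var',
      declarations: [
        {
          type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'AST' },
          // value holds the string itself, raw holds the source text
          init: { type: 'Literal', value: 'is tree', raw: '"is tree"' },
        },
      ],
    },
  ],
}

const declarator = ast.body[0].declarations[0]
console.log(declarator.id.name)    // AST
console.log(declarator.init.value) // is tree
```

Reading a value out of the tree is ordinary property access, which is why AST-based tools are mostly code that walks and edits nested objects.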

Lexical Analysis and Syntax Analysis

JavaScript is an interpreted language; generally the process is lexical analysis → syntax analysis → AST, after which execution can begin.

Lexical analysis (also called scanning) converts a character stream into a stream of tokens. It reads the code and groups characters into tokens according to the language's lexical rules.

For example, the code var a = 2 is usually broken down into the tokens var, a, =, 2.

[
  { type: 'Keyword', value: 'var' },
  { type: 'Identifier', value: 'a' },
  { type: 'Punctuator', value: '=' },
  { type: 'Numeric', value: '2' },
]

During lexical analysis the source code is read character by character, which is why the step is often called scanning. When the scanner encounters a space, an operator, or another special symbol, it treats the characters read so far as a complete token.
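The scanning loop described above can be sketched in a few lines. This is a deliberately tiny tokenizer, not how Esprima is implemented: the tokenize function and the KEYWORDS set are illustrative names, and only identifiers, a handful of keywords, integer literals, and single-character punctuators are recognized.

```javascript
// Minimal tokenizer sketch: identifiers, a few keywords,
// integer literals and single-character punctuators only.
const KEYWORDS = new Set(['var', 'let', 'const', 'function', 'return'])

function tokenize(source) {
  const tokens = []
  let i = 0
  while (i < source.length) {
    const ch = source[i]
    if (/\s/.test(ch)) { i++; continue }        // whitespace separates tokens
    if (/[A-Za-z_$]/.test(ch)) {                 // identifier or keyword
      let j = i
      while (j < source.length && /[\w$]/.test(source[j])) j++
      const value = source.slice(i, j)
      tokens.push({ type: KEYWORDS.has(value) ? 'Keyword' : 'Identifier', value })
      i = j
    } else if (/[0-9]/.test(ch)) {               // integer literal
      let j = i
      while (j < source.length && /[0-9]/.test(source[j])) j++
      tokens.push({ type: 'Numeric', value: source.slice(i, j) })
      i = j
    } else if ('=;(){},+-*/'.includes(ch)) {     // punctuator
      tokens.push({ type: 'Punctuator', value: ch })
      i++
    } else {
      throw new SyntaxError(`Unexpected character "${ch}" at index ${i}`)
    }
  }
  return tokens
}

console.log(tokenize('var a = 2'))
```

For var a = 2 this yields the same four tokens listed earlier. A real scanner also handles strings, comments, multi-character operators, and regular expression literals.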

Syntax analysis (also called parsing) converts the token array into a tree structure and validates the syntax. If there is a syntax error, an exception is thrown.

{
  ...
  "type": "VariableDeclarator",
  "id": {
    "type": "Identifier",
    "name": "a"
  },
  ...
}
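As a rough illustration of how a parser turns tokens into a tree and rejects invalid input, the sketch below handles exactly one grammar rule. The parseVarDeclaration function and its expect helper are hypothetical names for this example; a real parser such as Esprima covers the full grammar.

```javascript
// Parser sketch for one grammar rule only:
//   VariableDeclaration := "var" Identifier "=" Numeric
function parseVarDeclaration(tokens) {
  let pos = 0
  // Consume the next token, or throw if it is not what the grammar expects.
  function expect(type, value) {
    const token = tokens[pos]
    if (!token || token.type !== type || (value !== undefined && token.value !== value)) {
      const found = token ? token.value : 'end of input'
      throw new SyntaxError(`Expected ${value || type} but found ${found}`)
    }
    pos++
    return token
  }

  expect('Keyword', 'var')
  const id = expect('Identifier')
  expect('Punctuator', '=')
  const num = expect('Numeric')

  return {
    type: 'VariableDeclaration',
    kind: 'var',
    declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: id.value },
      init: { type: 'Literal', value: Number(num.value), raw: num.value },
    }],
  }
}

const tokens = [
  { type: 'Keyword', value: 'var' },
  { type: 'Identifier', value: 'a' },
  { type: 'Punctuator', value: '=' },
  { type: 'Numeric', value: '2' },
]
console.log(JSON.stringify(parseVarDeclaration(tokens), null, 2))
```

Dropping the = token from the input makes expect throw a SyntaxError, which is the validation step mentioned above.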

You can view the AST generated by syntax analysis online at http://esprima.org .

What Can AST Do

Syntax checking, code style checking, code formatting, syntax highlighting, error hints, auto‑completion, etc.

Code obfuscation and compression

Optimizing and restructuring code

For example, given a function function a() {}, you may want to change it to function b() {}.

In webpack, after code compilation require('a') becomes __webpack_require__("*/**/a.js").

Below is a set of tools that can convert code to an AST, modify nodes, and generate new code.

AST Parsing Process

Prepare the tools:

esprima – code → AST

estraverse – traverse/transform the AST

escodegen – AST → code

An online AST conversion site is recommended: https://astexplorer.net/ .

For example, take the code function getUser() {}, which we want to rename to hello. First parse it and traverse the resulting AST:

const esprima = require('esprima')
const estraverse = require('estraverse')
const code = `function getUser() {}`
// generate AST
const ast = esprima.parseScript(code)
// traverse the AST; enter and leave are called for every node
estraverse.traverse(ast, {
  enter(node) { console.log('enter -> node.type', node.type) },
  leave(node) { console.log('leave -> node.type', node.type) },
})

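For function getUser() {}, the output is:

enter -> node.type Program
enter -> node.type FunctionDeclaration
enter -> node.type Identifier
leave -> node.type Identifier
enter -> node.type BlockStatement
leave -> node.type BlockStatement
leave -> node.type FunctionDeclaration
leave -> node.type Program

Each node is entered before its children and left only after all of them have been visited, so the AST traversal is depth-first.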
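The enter/leave order can be reproduced without any library. The following is a minimal sketch, not estraverse's implementation: the traverse function and the hand-written tree are illustrative assumptions, and child nodes are found by scanning every property rather than by a table of known keys per node type.

```javascript
// Hand-written AST for: function getUser() {}
const ast = {
  type: 'Program',
  body: [{
    type: 'FunctionDeclaration',
    id: { type: 'Identifier', name: 'getUser' },
    params: [],
    body: { type: 'BlockStatement', body: [] },
  }],
}

// Simplified walker: any object with a string "type" counts as a node,
// and children are visited in property order.
function traverse(node, visitor) {
  if (visitor.enter) visitor.enter(node)
  for (const key of Object.keys(node)) {
    const children = Array.isArray(node[key]) ? node[key] : [node[key]]
    for (const child of children) {
      if (child && typeof child.type === 'string') traverse(child, visitor)
    }
  }
  if (visitor.leave) visitor.leave(node)
}

const order = []
traverse(ast, {
  enter(node) { order.push(`enter ${node.type}`) },
  leave(node) { order.push(`leave ${node.type}`) },
})
console.log(order.join('\n'))
```

Because leave runs only after the recursive calls return, a node's leave entry always comes after the enter and leave entries of everything beneath it.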

Modify Function Name

When we inspect the AST we find that the function name is stored in a node whose type is Identifier. Modifying that node directly changes the function name.

const escodegen = require('escodegen')
// transform tree
estraverse.traverse(ast, {
  // both enter and leave can modify
  enter(node) {
    console.log('enter -> node.type', node.type)
    if (node.type === 'Identifier') {
      node.name = 'hello'
    }
  },
  leave(node) { console.log('leave -> node.type', node.type) },
})
// generate new code
const result = escodegen.generate(ast)
console.log(result) // function hello() {}
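Renaming every Identifier is fine for an empty function, but it would also rename parameters and local variables in a larger one. A more targeted approach, sketched here on a hand-written tree with a hypothetical renameFunction helper, changes only the id of the matching FunctionDeclaration. It does not update call sites, which a real refactoring tool would need to do with scope analysis.

```javascript
// Hand-written AST for: function getUser(id) { return id }
const ast = {
  type: 'Program',
  body: [{
    type: 'FunctionDeclaration',
    id: { type: 'Identifier', name: 'getUser' },
    params: [{ type: 'Identifier', name: 'id' }],
    body: {
      type: 'BlockStatement',
      body: [{ type: 'ReturnStatement', argument: { type: 'Identifier', name: 'id' } }],
    },
  }],
}

// Rename only function declarations called `from`;
// all other identifiers are left untouched.
function renameFunction(node, from, to) {
  if (!node || typeof node.type !== 'string') return
  if (node.type === 'FunctionDeclaration' && node.id && node.id.name === from) {
    node.id.name = to
  }
  for (const key of Object.keys(node)) {
    const children = Array.isArray(node[key]) ? node[key] : [node[key]]
    children.forEach(child => renameFunction(child, from, to))
  }
}

renameFunction(ast, 'getUser', 'hello')
console.log(ast.body[0].id.name)        // hello
console.log(ast.body[0].params[0].name) // id
```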

Babel Working Principle

Any discussion of ASTs leads to Babel. As ES6 came into wide use, Babel emerged to address browsers that did not yet support ES6 features by converting ES6 code to ES5, which virtually all browsers can run. Babel's code transformation is built on ASTs, which is why the two are so closely linked.

In Babel, the compilation process uses the configuration files .babelrc or babel.config.js, which contain presets and plugins (among other options).

Difference Between Plugins and Presets

// .babelrc
{
  "presets": ["@babel/preset-env"],
  "plugins": []
}

When presets includes @babel/preset-env, @babel/core will look for the preset’s plugin package, which is a collection of plugins.

The core Babel package does not perform code transformation itself; it only provides core APIs. Real transformation work is done by plugins or presets. For example, to transform arrow functions, the plugin @babel/plugin-transform-arrow-functions is used. When many transformations are needed, a preset (a set of plugins) is more convenient.

Babel Plugin Usage

To transform an arrow function into a normal function:

const babel = require('@babel/core')
const code = `const fn = (a, b) => a + b`
// babel.transform will automatically traverse and apply the appropriate preset/plugin
const r = babel.transform(code, {
  presets: ['@babel/preset-env'],
})
console.log(r.code)
// "use strict";
// var fn = function fn(a, b) { return a + b; };

If only the arrow‑function‑to‑normal‑function feature is needed, you can use the specific plugin directly:

const r = babel.transform(code, {
  plugins: ['@babel/plugin-transform-arrow-functions'],
})
console.log(r.code)
// const fn = function (a, b) { return a + b; };

The result shows that the variable declaration keyword remains const; only the arrow function is transformed.

Write Your Own Plugin

Now we can write our own plugins to perform custom code transformations, using the AST manipulation logic described earlier.

For example, convert const fn = (a, b) => a + b into const fn = function(a, b) { return a + b }.

Analyze AST Structure

First, inspect the AST of the arrow function and the normal function on the online AST explorer to see the differences.

From the analysis we can conclude:

After conversion the function is no longer an ArrowFunctionExpression but a FunctionExpression.

We need to replace ArrowFunctionExpression nodes with FunctionExpression nodes.

The binary expression inside the arrow function must be placed inside a BlockStatement.

Essentially we are rebuilding a new tree and generating code from it.

Visitor Pattern

When developing Babel plugins, the visitor pattern is used: when a certain node path is visited, you can match it and modify the node. For example, when encountering an ArrowFunctionExpression, we replace it with a normal function.

const babel = require('@babel/core')
const code = `const fn = (a, b) => a + b`
const arrowFnPlugin = {
  visitor: {
    ArrowFunctionExpression(path) {
      const node = path.node
      console.log('ArrowFunctionExpression -> node', node)
      // further processing will replace the node
    },
  },
}
const r = babel.transform(code, { plugins: [arrowFnPlugin] })
console.log(r)
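The idea behind the visitor object, a walker that looks up a handler by node type, can be sketched without Babel. This is an illustration only: the visit function is hypothetical, and real Babel hands each handler a path object with methods such as replaceWith rather than the bare node.

```javascript
// Sketch of type-keyed dispatch. Babel itself passes a path object,
// not the raw node, and supports enter/exit hooks; both are omitted here.
function visit(node, visitor) {
  if (!node || typeof node.type !== 'string') return
  const handler = visitor[node.type]
  if (handler) handler(node)
  for (const key of Object.keys(node)) {
    const children = Array.isArray(node[key]) ? node[key] : [node[key]]
    children.forEach(child => visit(child, visitor))
  }
}

// Hand-written, simplified AST for: const fn = (a, b) => a + b
const ast = {
  type: 'VariableDeclaration',
  kind: 'const',
  declarations: [{
    type: 'VariableDeclarator',
    id: { type: 'Identifier', name: 'fn' },
    init: {
      type: 'ArrowFunctionExpression',
      params: [{ type: 'Identifier', name: 'a' }, { type: 'Identifier', name: 'b' }],
      body: {
        type: 'BinaryExpression',
        operator: '+',
        left: { type: 'Identifier', name: 'a' },
        right: { type: 'Identifier', name: 'b' },
      },
    },
  }],
}

const log = []
visit(ast, {
  ArrowFunctionExpression(node) { log.push(`arrow function with ${node.params.length} params`) },
  Identifier(node) { log.push(`identifier ${node.name}`) },
})
console.log(log)
```

Node types without a handler are simply walked through, so a plugin only has to name the node types it cares about.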

Modify AST Structure

We obtain an ArrowFunctionExpression node and need to replace it with a FunctionExpression. Babel provides @babel/types to help create new nodes. @babel/types has two main purposes:

Check whether a node is of a certain type (e.g., t.isArrowFunctionExpression(node)).

Generate corresponding node structures.

To create a FunctionExpression we use:

t.functionExpression(id, params, body, generator, async)

id: Identifier (default null)

params: Array<LVal> – function parameters

body: BlockStatement – the function body

generator: boolean (default false)

async: boolean (default false)

We also need a BlockStatement:

t.blockStatement(body, directives)

The body of a BlockStatement is an array of statements; in our case it contains a ReturnStatement.

Finally we replace the original node:

const babel = require('@babel/core')
const t = require('@babel/types')
const code = `const fn = (a, b) => a + b`
const arrowFnPlugin = {
  visitor: {
    ArrowFunctionExpression(path) {
      const node = path.node
      const params = node.params
      const body = node.body
      // the expression body must be wrapped in a ReturnStatement inside the block
      const functionExpression = t.functionExpression(null, params, t.blockStatement([t.returnStatement(body)]))
      path.replaceWith(functionExpression)
    },
  },
}
const r = babel.transform(code, { plugins: [arrowFnPlugin] })
console.log(r.code) // const fn = function(a, b) { return a + b; };

Special Cases

Arrow functions can omit the return keyword. The previous plugin works for that case, but if the user writes an explicit return, the plugin needs to handle it as well.

const fn = (a, b) => { return a + b }
// should become
const fn = function(a, b) { return a + b }

We adjust the plugin to ensure the body is always a BlockStatement:

ArrowFunctionExpression(path) {
  const node = path.node
  const params = node.params
  let body = node.body
  if (!t.isBlockStatement(body)) {
    // expression body: wrap it in a block that returns the expression
    body = t.blockStatement([t.returnStatement(body)])
  }
  const functionExpression = t.functionExpression(null, params, body)
  path.replaceWith(functionExpression)
}
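The two cases can be exercised on plain node objects to check the logic independently of Babel. The arrowToFunction helper below is a hypothetical sketch that builds nodes by hand instead of using @babel/types. It also assumes the arrow function does not use this or arguments, which are bound lexically in arrow functions and would need extra handling.

```javascript
// Convert an ArrowFunctionExpression node into a FunctionExpression node.
// Assumption: the arrow function does not reference `this` or `arguments`.
function arrowToFunction(node) {
  const body = node.body.type === 'BlockStatement'
    ? node.body // (a, b) => { return a + b }: reuse the block as is
    : {         // (a, b) => a + b: wrap the expression in a return
      type: 'BlockStatement',
      body: [{ type: 'ReturnStatement', argument: node.body }],
    }
  return {
    type: 'FunctionExpression',
    id: null,
    params: node.params,
    body,
    generator: false,
    async: Boolean(node.async),
  }
}

const params = [{ type: 'Identifier', name: 'a' }, { type: 'Identifier', name: 'b' }]
const sum = {
  type: 'BinaryExpression',
  operator: '+',
  left: { type: 'Identifier', name: 'a' },
  right: { type: 'Identifier', name: 'b' },
}

// Expression body: (a, b) => a + b
const fromExpression = arrowToFunction({ type: 'ArrowFunctionExpression', params, body: sum })
// Block body: (a, b) => { return a + b }
const block = { type: 'BlockStatement', body: [{ type: 'ReturnStatement', argument: sum }] }
const fromBlock = arrowToFunction({ type: 'ArrowFunctionExpression', params, body: block })

console.log(fromExpression.body.body[0].type) // ReturnStatement
console.log(fromBlock.body === block)         // true
```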

On‑Demand Import

In Vue and React projects, UI libraries such as element-ui, vant, and antd support both global import and on-demand import. By default they are imported globally; to switch to on-demand import you need the babel-plugin-import plugin, which rewrites the import statements.

For example, the Vant import import { Button } from 'vant' becomes import Button from 'vant/lib/Button', reducing bundle size.

Analyze Import Syntax Tree

import { Button, Icon } from 'vant' is transformed into import Button from 'vant/lib/Button'; import Icon from 'vant/lib/Icon'

The two ASTs differ as follows:

The destructuring import has a single ImportDeclaration with multiple ImportSpecifiers; the transformed version has multiple ImportDeclarations, each with an ImportDefaultSpecifier.

The source field differs: the original points to the package, the transformed points to the specific file.

Analyze Types

We need to generate multiple ImportDeclarations. The API is:

t.importDeclaration(specifiers, source)

Each ImportDeclaration requires an ImportDefaultSpecifier and a StringLiteral for the source:

t.importDefaultSpecifier(local)

t.stringLiteral(value)

Write the Plugin

const babel = require('@babel/core')
const t = require('@babel/types')
const code = `import { Button, Icon } from 'vant'`
function importPlugin(opt) {
  const { libraryDir } = opt
  return {
    visitor: {
      ImportDeclaration(path) {
        const node = path.node
        const specifiers = node.specifiers
        if (!(specifiers.length === 1 && t.isImportDefaultSpecifier(specifiers[0]))) {
          const result = specifiers.map(specifier => {
            const local = specifier.local
            const source = t.stringLiteral(`${node.source.value}/${libraryDir}/${specifier.local.name}`)
            return t.importDeclaration([t.importDefaultSpecifier(local)], source)
          })
          path.replaceWithMultiple(result)
        }
      },
    },
  }
}
const r = babel.transform(code, { plugins: [importPlugin({ libraryDir: 'lib' })] })
console.log(r.code)

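The transformation works as expected: the output contains one default import per named specifier, import Button from "vant/lib/Button" and import Icon from "vant/lib/Icon".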

Special Case

If the user writes a mixed import such as import vant, { Button, Icon } from 'vant', the plugin must keep the default import unchanged and only transform the named specifiers.

function importPlugin(opt) {
  const { libraryDir } = opt
  return {
    visitor: {
      ImportDeclaration(path) {
        const node = path.node
        const specifiers = node.specifiers
        if (!(specifiers.length === 1 && t.isImportDefaultSpecifier(specifiers[0]))) {
          const result = specifiers.map(specifier => {
            let local = specifier.local
            let source
            if (t.isImportDefaultSpecifier(specifier)) {
              source = t.stringLiteral(node.source.value)
            } else {
              source = t.stringLiteral(`${node.source.value}/${libraryDir}/${specifier.local.name}`)
            }
            return t.importDeclaration([t.importDefaultSpecifier(local)], source)
          })
          path.replaceWithMultiple(result)
        }
      },
    },
  }
}
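One detail the plugin glosses over is aliasing: for import { Button as Btn } the file path should come from the exported name (imported.name), while the local binding stays Btn. The sketch below shows that variant on hand-written nodes. The splitImport helper is hypothetical, the lib/<Name> layout is taken from the example above rather than from any particular library, and default or namespace specifiers are passed through with the original source.

```javascript
// Split one ImportDeclaration into several, one per specifier.
// Assumption: components live at <package>/<libraryDir>/<ExportedName>.
function splitImport(node, libraryDir) {
  return node.specifiers.map(specifier => {
    const isNamed = specifier.type === 'ImportSpecifier'
    const path = isNamed
      ? `${node.source.value}/${libraryDir}/${specifier.imported.name}`
      : node.source.value // default and namespace imports keep the package path
    return {
      type: 'ImportDeclaration',
      specifiers: [isNamed
        ? { type: 'ImportDefaultSpecifier', local: specifier.local }
        : specifier],
      source: { type: 'StringLiteral', value: path },
    }
  })
}

// Hand-written AST for: import vant, { Button as Btn, Icon } from 'vant'
const declaration = {
  type: 'ImportDeclaration',
  source: { type: 'StringLiteral', value: 'vant' },
  specifiers: [
    { type: 'ImportDefaultSpecifier', local: { type: 'Identifier', name: 'vant' } },
    { type: 'ImportSpecifier', imported: { type: 'Identifier', name: 'Button' }, local: { type: 'Identifier', name: 'Btn' } },
    { type: 'ImportSpecifier', imported: { type: 'Identifier', name: 'Icon' }, local: { type: 'Identifier', name: 'Icon' } },
  ],
}

const summary = splitImport(declaration, 'lib')
  .map(d => `${d.specifiers[0].local.name} <- ${d.source.value}`)
console.log(summary)
```

In a Babel plugin the same logic would sit inside the ImportDeclaration visitor, using t.importDeclaration and path.replaceWithMultiple as in the code above.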

babylon

Babylon is the JavaScript parser used in Babel. Since Babel 7 it has been published as @babel/parser.

babylon and Babel Relationship

Babel uses the engine babylon, which is a fork of the acorn project. acorn provides basic parsing to an AST; traversal and node replacement require additional packages such as acorn-traverse. Babel integrates these capabilities into a unified plugin system.

Using babylon

We will write a plugin that converts array spread syntax to ES5 code:

Convert const arr = [ ...arr1, ...arr2 ] into var arr = [].concat(arr1, arr2) using babylon, @babel/traverse, @babel/generator, and @babel/types.

Analyze Syntax Tree

Observations:

Both trees are variable declarations, but the declaration keyword differs (const vs var).

The initializer differs: one is an ArrayExpression, the other a CallExpression.

Therefore we only need to replace the array expression with a call expression.

Analyze Types

We need to generate a CallExpression:

t.callExpression(callee, arguments)

The callee is a MemberExpression (e.g., [].concat), which is created with:

t.memberExpression(object, property, computed, optional)

The object is an empty ArrayExpression:

t.arrayExpression(elements)

Finally we need to create a VariableDeclarator and a VariableDeclaration:

t.variableDeclarator(id, init)

t.variableDeclaration(kind, declarations)

Write the Transformation

const babylon = require('babylon')
const traverse = require('@babel/traverse').default
const generator = require('@babel/generator').default
const t = require('@babel/types')

const code = `const arr = [ ...arr1, ...arr2 ]`
const ast = babylon.parse(code, { sourceType: 'module' })

traverse(ast, {
  VariableDeclaration(path) {
    const node = path.node
    const declarations = node.declarations
    const kind = 'var'
    if (node.kind !== kind && declarations.length === 1 && t.isArrayExpression(declarations[0].init)) {
      const args = declarations[0].init.elements.map(item => item.argument)
      const callee = t.memberExpression(t.arrayExpression(), t.identifier('concat'), false)
      const init = t.callExpression(callee, args)
      const declaration = t.variableDeclarator(declarations[0].id, init)
      const variableDeclaration = t.variableDeclaration(kind, [declaration])
      path.replaceWith(variableDeclaration)
    }
  },
})

const output = generator(ast).code
console.log(output) // var arr = [].concat(arr1, arr2)
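The transformation above assumes every array element is a spread. A slightly more defensive version, sketched here on plain node objects with a hypothetical spreadToConcat helper, wraps non-spread elements in their own array so that [x, ...arr1] becomes [].concat([x], arr1). It still assumes the array has no holes and that every spread operand is a real array, since concat flattens arrays but does not iterate arbitrary iterables the way spread does.

```javascript
// Build a [].concat(...) CallExpression from an ArrayExpression node.
// Assumptions: no holes in the array, and spread operands are arrays.
function spreadToConcat(arrayNode) {
  const args = arrayNode.elements.map(el =>
    el.type === 'SpreadElement'
      ? el.argument                                  // ...arr1 -> arr1
      : { type: 'ArrayExpression', elements: [el] }) // x -> [x]
  return {
    type: 'CallExpression',
    callee: {
      type: 'MemberExpression',
      object: { type: 'ArrayExpression', elements: [] },
      property: { type: 'Identifier', name: 'concat' },
      computed: false,
    },
    arguments: args,
  }
}

// Hand-written AST for the initializer: [ x, ...arr1, ...arr2 ]
const arrayNode = {
  type: 'ArrayExpression',
  elements: [
    { type: 'Identifier', name: 'x' },
    { type: 'SpreadElement', argument: { type: 'Identifier', name: 'arr1' } },
    { type: 'SpreadElement', argument: { type: 'Identifier', name: 'arr2' } },
  ],
}
const call = spreadToConcat(arrayNode)
console.log(call.arguments.map(arg => arg.type))

// Runtime check that the rewritten form is equivalent for real arrays.
const x = [0], arr1 = [1, 2], arr2 = [3]
const viaSpread = [x, ...arr1, ...arr2]
const viaConcat = [].concat([x], arr1, arr2)
console.log(JSON.stringify(viaSpread) === JSON.stringify(viaConcat))
```

Wrapping x matters when x is itself an array: [].concat(x, arr1) would flatten it, whereas [x, ...arr1] keeps it as a single element.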

Concrete Syntax Tree

The counterpart of an AST is a Concrete Syntax Tree (CST), also known as a parse tree. In compilation, a parser creates a CST; later phases such as semantic analysis may augment the tree. See the differences between AST and CST for more details.

Supplement

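Below is a non-exhaustive list of AST node types, taken from the editor's type hint for the node parameter of a traversal callback (the names follow the ESTree type definitions):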

(parameter) node: Identifier | SimpleLiteral | RegExpLiteral | Program | FunctionDeclaration | FunctionExpression | ArrowFunctionExpression | SwitchCase | CatchClause | VariableDeclarator | ExpressionStatement | BlockStatement | EmptyStatement | DebuggerStatement | WithStatement | ReturnStatement | LabeledStatement | BreakStatement | ContinueStatement | IfStatement | SwitchStatement | ThrowStatement | TryStatement | WhileStatement | DoWhileStatement | ForStatement | ForInStatement | ForOfStatement | VariableDeclaration | ClassDeclaration | ThisExpression | ArrayExpression | ObjectExpression | YieldExpression | UnaryExpression | UpdateExpression | BinaryExpression | AssignmentExpression | LogicalExpression | MemberExpression | ConditionalExpression | SimpleCallExpression | NewExpression | SequenceExpression | TemplateLiteral | TaggedTemplateExpression | ClassExpression | MetaProperty | AwaitExpression | Property | AssignmentProperty | Super | TemplateElement | SpreadElement | ObjectPattern | ArrayPattern | RestElement | AssignmentPattern | ClassBody | MethodDefinition | ImportDeclaration | ExportNamedDeclaration | ExportDefaultDeclaration | ExportAllDeclaration | ImportSpecifier | ImportDefaultSpecifier | ImportNamespaceSpecifier | ExportSpecifier

Babel's documentation provides a detailed definition of its AST.

Source Code Repository

The code is stored on GitHub: https://github.com/fecym/ast-share
