
How to Slash Node.js Serverless Startup Time Below 100 ms

This article examines why Node.js serverless functions often exceed the desired 100 ms cold‑start target, analyzes profiling data to pinpoint costly file I/O and compilation steps, and presents practical optimizations such as module bundling, V8 code‑caching, and custom require handling to dramatically reduce startup latency.

Taobao Frontend Technology

When deploying Node.js applications, developers rarely focus on how long process startup takes; a typical deployment finishes in about five minutes, most of it spent on interactions with corporate systems, and that seems acceptable.

However, in the serverless era the Node.js runtime is a cornerstone of modern development, promising elasticity, efficiency, and cost-effectiveness. If a Node.js FaaS function still requires minute-level startup, it cannot respond quickly to requests, especially under burst traffic, and those advantages are negated.

All platforms offering Node.js FaaS strive to shorten cold‑ and warm‑start times. Beyond infrastructure optimizations, the Node.js function itself must also contribute to this effort.

The total time from request receipt to a ready container should stay under 500 ms, with the function runtime portion targeted at 100 ms, roughly the threshold below which humans perceive a response as instantaneous.

How Fast Is Node.js?

We often assume Node.js is fast; a simple <code>console.log</code> seems instantaneous. How fast is it really?

<code>// console.js
console.log(process.uptime() * 1000);
</code>

On Node.js LTS v10.16.0 on a personal workstation:

<code>node console.js
// average time: 86 ms
time node console.js
// node console.js  0.08s user 0.03s system 92% cpu 0.114 total
</code>

Thus, with a 100 ms budget, little time remains for additional code loading. In a typical function platform container:

<code>node console.js
// average time: 170 ms
real    0m0.177s
user    0m0.051s
sys     0m0.009s
</code>

Introducing a module, e.g., <code>serverless-runtime</code>:

<code>// require.js
console.time('load');
require('serverless-runtime');
console.timeEnd('load');
</code>

Local environment:

<code>node require.js
// average time: 329 ms
</code>

Server environment:

<code>node require.js
// average time: 1433 ms
</code>

Including process startup, loading the function runtime therefore takes roughly 1.6 seconds (about 170 ms process start plus 1433 ms module load), far exceeding the 100 ms goal.

Why Is It Slow?

Profiling with Node.js's built-in <code>--prof</code> flag reveals the bottlenecks.

<code>node --prof require.js
node --prof-process isolate-xxx-v8.log > result
</code>
<code>[Summary]:
   ticks  total  nonlib   name
     60   13.7%   13.8%  JavaScript
    371   84.7%   85.5%  C++
     10    2.3%    2.3%  GC
      4    0.9%          Shared libraries
      3    0.7%          Unaccounted
[C++]:
   ticks  total  nonlib   name
    198   45.2%   45.6%  node::contextify::ContextifyScript::New(...)
     13    3.0%    3.0%  node::fs::InternalModuleStat(...)
      8    1.8%    1.8%  node::Buffer::StringSlice(...)
      5    1.1%    1.2%  node::GetBinding(...)
      4    0.9%    0.9%  __memmove_ssse3_back
      4    0.9%    0.9%  __GI_mprotect
      3    0.7%    0.7%  v8::internal::StringTable::LookupStringIfExists_NoAllocate(...)
      3    0.7%    0.7%  v8::internal::Scavenger::ScavengeObject(...)
      3    0.7%    0.7%  node::fs::Open(...)
</code>

Running the same profiling on the runtime startup shows a similar pattern: the majority of time is spent in C++ functions such as <code>Open</code>, <code>ContextifyContext</code>, and <code>CompileFunction</code>, all triggered during <code>require</code> operations. Optimizing <code>require</code> is therefore the first target.

How to Speed It Up

The two main contributors to startup latency are file I/O and code compilation.

File I/O

File I/O occurs during module lookup and module content reading. Module lookup probes a chain of candidate directories for each specifier, causing many <code>open</code> calls, especially with complex dependency trees. Reading large module files adds further overhead.

Flattening dependencies into a single bundle (as front-end code does) can reduce this I/O. Using the community tool <code>ncc</code> to bundle <code>serverless-runtime</code> yields:

<code>ncc build node_modules/serverless-runtime/src/index.ts
node require.js
// average load time: 934 ms
</code>

This improves speed by about 34 %.

However, bundling everything can increase the bundle size, leading to slower loads for large modules, as shown in the following test:

<code>import * as _ from 'lodash';
import * as Sequelize from 'sequelize';
import * as Pandorajs from 'pandora';
console.log('lodash: ', _);
console.log('Sequelize: ', Sequelize);
console.log('Pandorajs: ', Pandorajs);
</code>

Benchmark results (chart from the original not reproduced here):

The bundled version actually took longer because the single file grew large, highlighting the need for tree-shaking when using <code>ncc</code>.

Code Compilation

Beyond I/O, compiling JavaScript to V8 bytecode is costly. Since most module code is static, caching the compiled output avoids repeated compilation. Node.js v5.7.0 exposed V8's code cache through the <code>cachedData</code> option of <code>vm.Script</code>, allowing compiled scripts to be reused across runs.

<code>// Use v8-compile-cache to generate cache locally, then deploy
node require.js
// average time: 868 ms
</code>

This yields roughly a 40 % speed boost, though file‑lookup I/O remains.

Advanced Idea

By modifying the <code>require</code> function to load modules directly from a pre-generated cache, we could eliminate the lookup phase entirely, completing all module loads with a single I/O operation. This approach may affect remote debugging and source indexing, so it requires further investigation.

Upcoming Plans

We intend to apply the proven optimizations (<code>ncc</code> bundling, code caching, and the custom <code>require</code> hack) in production, balancing speed gains with maintainability.

We will also review the overall function runtime design and business logic to eliminate unnecessary delays.

Node.js 12 enables the code cache for its built-in modules by default, reducing startup to around 120 ms in our server environment, and we plan to adopt it.

Future Considerations

V8’s snapshot feature can further accelerate startup, as used in NW.js and Electron. Node.js 12.6 enables user‑code snapshots, offering a modest 10‑15 % improvement. Packaging the function runtime as a snapshot is a potential avenue, though results are still uncertain.

For Java functions, GraalVM can achieve sub‑10 ms cold starts at the cost of some language features. Compiling the entire runtime to LLVM IR and then to native code is another research direction, albeit a challenging one.

Conclusion

Achieving a 100 ms function runtime startup is an exciting and demanding goal; contributions and ideas from the community are welcome.

Tags: performance optimization, serverless, Node.js, cold start, module bundling, code cache