10 Common Pitfalls When Streaming JSON in Node.js and Safer Patterns
This guide enumerates ten frequent traps encountered when streaming JSON in Node.js—such as assuming one chunk per object, UTF‑8 split issues, missing newline delimiters, back‑pressure overload, and handling of large numbers—and presents reliable patterns like using NDJSON framing, StringDecoder, pipeline, and proper error handling to avoid data loss and memory spikes.
Streaming JSON in Node is fraught with traps: chunk boundaries, UTF‑8 splits, back‑pressure, and NDJSON edge cases. Below are ten common pitfalls and safer patterns that avoid memory explosions or silent data corruption.
Architecture Flow: Where Streaming JSON Fails
Most data‑processing pipelines look like this:
HTTP response (readable stream)
│ (chunks)
▼
Decode bytes → string
│ (possible partial tokens)
▼
Framing (NDJSON / JSON array / SSE)
│ (record delimiters)
▼
Parse (JSON.parse / streaming parser)
▼
Transform → write / database / queue

The traps occur at the boundaries: byte‑to‑string conversion, framing, back‑pressure, and error handling.
1) Assuming One Chunk Equals One JSON Object
Trap
Typical code:
res.on("data", (chunk) => {
const obj = JSON.parse(chunk); // 💥
});A chunk can contain half a JSON token, multiple objects, or any combination.
Safer Pattern
Frame the stream (NDJSON lines, JSON‑array elements, or a streaming parser). If you control the producer, prefer NDJSON.
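For orientation, here is a minimal sketch of newline framing (handleObject is a hypothetical per-record callback; the raw string concatenation still has the UTF‑8 issue covered in pitfall 2, and the full, correct version appears in pitfall 4):
let buffered = "";
res.on("data", (chunk) => {
  buffered += chunk;                    // accumulate across chunk boundaries
  const lines = buffered.split("\n");
  buffered = lines.pop();               // keep the trailing partial record
  for (const line of lines) {
    if (line.trim()) handleObject(JSON.parse(line));
  }
});
res.on("end", () => {
  if (buffered.trim()) handleObject(JSON.parse(buffered)); // see pitfall 4
});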
2) UTF‑8 Characters Split Across Chunks
Trap
Naïvely converting a buffer to a string can split multi‑byte characters, producing replacement characters or corrupted JSON.
Safer Pattern
Use StringDecoder to safely decode UTF‑8 across chunk boundaries.
import { StringDecoder } from "node:string_decoder";
const decoder = new StringDecoder("utf8");
let buf = "";
stream.on("data", (chunk) => {
buf += decoder.write(chunk);
});
stream.on("end", () => {
buf += decoder.end();
});Combine decoding with framing instead of building a huge buffer.
3) NDJSON "Almost Works" Until a Newline Appears Inside a String
Trap
NDJSON relies on the newline as its record delimiter. If the producer pretty‑prints its output or emits raw (unescaped) newlines inside string values, the line splitter breaks.
Safer Pattern
Ensure the producer outputs one JSON object per line, compact and valid (see the producer sketch below).
Validate the stream and fail fast on framing corruption.
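A producer‑side sketch (writeNdjsonRecord and res are illustrative names for any writable stream): JSON.stringify escapes newlines inside string values as \n, so a compact record never contains a literal line break.
function writeNdjsonRecord(res, obj) {
  res.write(JSON.stringify(obj) + "\n"); // compact; never JSON.stringify(obj, null, 2)
}

writeNdjsonRecord(res, { message: "line one\nline two" });
// wire format: {"message":"line one\nline two"}\n  (still exactly one line)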
4) Missing Trailing Newline Causes Last Record Loss
Trap
If your code only parses when it sees a newline, the final object without a trailing \n is silently dropped.
Safer Pattern (NDJSON line splitter)
Keep a remainder buffer and flush it on end.
import { Transform } from "node:stream";
import { StringDecoder } from "node:string_decoder";

class NdjsonParser extends Transform {
  constructor() {
    super({ readableObjectMode: true });
    this.decoder = new StringDecoder("utf8");
    this.remainder = "";
  }
  _transform(chunk, _enc, cb) {
    this.remainder += this.decoder.write(chunk);
    const lines = this.remainder.split("\n");
    this.remainder = lines.pop(); // keep last partial line
    try {
      for (const line of lines) {
        if (!line.trim()) continue;
        this.push(JSON.parse(line));
      }
      cb();
    } catch (e) {
      cb(e);
    }
  }
  _flush(cb) {
    try {
      const tail = (this.remainder + this.decoder.end()).trim();
      if (tail) this.push(JSON.parse(tail));
      cb();
    } catch (e) {
      cb(e);
    }
  }
}

This is the "boring but correct" NDJSON baseline.
5) Ignoring Back‑Pressure Leads to Memory Spikes
Trap
Your parsing runs faster than writing, causing the pipeline to buffer objects in memory.
Safer Pattern
Use pipeline() with proper streams so Node manages back‑pressure.
import { pipeline } from "node:stream/promises";
import { Transform } from "node:stream";
const toDb = new Transform({
  objectMode: true,
  async transform(obj, _enc, cb) {
    try {
      await writeToDb(obj); // natural back‑pressure
      cb();
    } catch (e) {
      cb(e);
    }
  }
});

await pipeline(res, new NdjsonParser(), toDb);

Keep the whole path as a single pipeline to avoid accidental buffering of the entire universe.
6) "Streaming a JSON Array" Without a Real Array Parser
Trap
Reading the whole body until the closing "]" and then calling JSON.parse on everything is not streaming; it just postpones the out‑of‑memory failure until the payload grows large enough.
Safer Pattern
Prefer NDJSON over a massive JSON array for streaming APIs.
If you must keep the array format, use a true streaming JSON parser (SAX‑style) instead of string splitting, as in the sketch below.
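For illustration, a sketch assuming the third‑party stream-json package (not part of Node core), following its documented CommonJS usage; the file name and handleRecord are placeholders:
const fs = require("node:fs");
const { pipeline, Writable } = require("node:stream");
const { parser } = require("stream-json");
const { streamArray } = require("stream-json/streamers/StreamArray");

pipeline(
  fs.createReadStream("big-array.json"), // e.g. [ {...}, {...}, ... ]
  parser(),      // tokenizes the JSON byte stream incrementally
  streamArray(), // emits one { key, value } pair per top-level array element
  new Writable({
    objectMode: true,
    write({ value }, _enc, cb) {
      handleRecord(value); // hypothetical per-record handler
      cb();
    },
  }),
  (err) => {
    if (err) console.error("array streaming failed:", err);
  }
);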
7) BigInt and Silent Numeric Loss
Trap
JavaScript numbers are IEEE‑754 doubles; large integers can exceed the safe range and be rounded silently.
Safer Pattern
If possible, have the producer send large IDs as strings.
If you must parse big integers, use a parser that supports BigInt or post‑process known fields.
A practical compromise is to treat known ID fields as strings in the contract; a small sketch follows.
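A sketch of both the loss and one workaround; the field name id and the 16‑digit threshold are assumptions, and the regex rewrite is deliberately narrow (it can misfire if the same pattern appears inside string values), so fixing the contract remains the better option:
const raw = '{"id": 9007199254740993}';    // 2^53 + 1

JSON.parse(raw).id;                        // 9007199254740992 (silently off by one)

// Quote the known big-integer field before parsing, then convert explicitly:
const quoted = raw.replace(/"id"\s*:\s*(\d{16,})/g, '"id":"$1"');
const obj = JSON.parse(quoted);            // { id: "9007199254740993" }
const id = BigInt(obj.id);                 // exact value preserved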
8) Partial Failure: Skipping Bad Records Causes Silent Corruption
Trap
Catching JSON.parse errors and ignoring them drops rows without a trace.
Safer Pattern
Decide your failure mode up front:
For critical pipelines, fail fast.
Otherwise, isolate bad records, logging byte offset (or line number), raw payload (truncated), and error code for later debugging, as in the sketch below.
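A minimal sketch of "isolate, don't ignore" (parseRecord and the lineNo bookkeeping are illustrative; the point is keeping enough context to replay the record later):
function parseRecord(line, lineNo) {
  try {
    return JSON.parse(line);
  } catch (err) {
    console.warn("ndjson: bad record skipped", {
      lineNo,                       // or byte offset, if the framing layer tracks it
      preview: line.slice(0, 200),  // truncated raw payload
      error: err.message,
    });
    return null;                    // caller counts skips and alerts past a threshold
  }
}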
9) Gzip/Deflate Decompression at the Wrong Position
Trap
Parsing compressed bytes as JSON or decoding to a string before decompression corrupts the data.
Safer Pattern
Detect Content-Encoding and decompress the bytes before decoding them to text.

HTTP (bytes) → decompress (bytes) → decode (utf8) → frame → parse

Node provides zlib.createGunzip() and similar tools; place them early in the pipeline.
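A sketch of that ordering, reusing NdjsonParser and toDb from pitfalls 4 and 5 and assuming res is a node:http response whose headers advertise the encoding:
import { pipeline } from "node:stream/promises";
import zlib from "node:zlib";

const stages = [res];
if (res.headers["content-encoding"] === "gzip") {
  stages.push(zlib.createGunzip()); // bytes in, bytes out, before any string decoding
}
stages.push(new NdjsonParser(), toDb);
await pipeline(...stages);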
10) Mixing Async Iteration with Event Handlers or Double‑Consuming Streams
Trap
Using both for await … of and stream.on('data') (or mixing pipelines with manual reads) can cause data loss, hangs, or partial reads.
Safer Pattern
Choose a single style: pipeline() plus Transform streams for complex flows, or for await … of for simple flows with clear back‑pressure.
Example: async iteration over NDJSON safely.
import { StringDecoder } from "node:string_decoder";
async function* ndjsonObjects(readable) {
  const decoder = new StringDecoder("utf8");
  let rem = "";
  for await (const chunk of readable) {
    rem += decoder.write(chunk);
    const lines = rem.split("\n");
    rem = lines.pop();
    for (const line of lines) {
      if (line.trim()) yield JSON.parse(line);
    }
  }
  rem += decoder.end();
  if (rem.trim()) yield JSON.parse(rem);
}

Standardizable Safe Defaults for Codebases
If you are building internal tools, consider these good defaults:
Prefer NDJSON as the streaming endpoint.
Always use StringDecoder for decoding.
Use pipeline() to manage back‑pressure and error propagation.
Set a maximum record size to prevent gigantic lines.
Maintain a clear contract: one object per line, UTF‑8 encoded, newline‑delimited, no pretty‑printed JSON.
Simple but important size guard:
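// Assumed placement: inside NdjsonParser._transform, right after appending to this.remainder;
// in a Transform, prefer reporting the failure via cb(err) so it propagates as a stream error.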
const MAX_LINE = 1_000_000; // 1 MB
if (this.remainder.length > MAX_LINE) {
  throw new Error("NDJSON line too large");
}

Conclusion
Streaming JSON in Node is hard not because Node lacks features, but because JSON was never designed for chunked transport and many “helpful” assumptions leak into code.
The safe path is surprisingly consistent: correct decoding → explicit framing → careful parsing → respect back‑pressure → safe failure.