Mastering LangGraph Streaming: Token, Node, and Event-Level Output to Prevent UI Crashes

The article explains why streaming output is essential for responsive LLM agents, compares batch and streaming latency, details the five LangGraph streamMode options with code examples, shows how to combine them, and lists common pitfalls to avoid runtime errors and poor user experience.


Why streaming output matters

When the invoke method is used, the UI stays blank for the whole response time, causing user anxiety. Streaming reduces the first‑character delay from seconds to sub‑second, moving the perceived wait from the "attention drifts" range (1‑10 s) to the "acceptable delay" range (0.1‑1 s) according to Nielsen's perception latency study.

invoke batch mode causing UI crash

The .stream() API returns an AsyncGenerator, allowing consumption of data while it is being generated, unlike .invoke() which returns only the final result.

AsyncGenerator vs invoke API
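The difference can be sketched without LangGraph itself; `fakeInvoke` and `fakeStream` below are hypothetical stand-ins for the two APIs:

```typescript
// Minimal sketch (no LangGraph dependency): contrasts a batch-style call with
// an AsyncGenerator consumed while data is produced. fakeInvoke and fakeStream
// are hypothetical stand-ins for .invoke() and .stream().
const tokens = ["Lang", "Graph ", "streams ", "tokens"];

async function fakeInvoke(): Promise<string> {
  // Batch: the caller sees nothing until the full string is ready.
  return tokens.join("");
}

async function* fakeStream(): AsyncGenerator<string> {
  // Streaming: each token is yielded as soon as it is "generated".
  for (const t of tokens) yield t;
}

(async () => {
  console.log(await fakeInvoke()); // arrives once, after the full wait

  let rendered = "";
  for await (const chunk of fakeStream()) {
    rendered += chunk; // a UI could paint each chunk immediately
  }
  console.log(rendered); // same text, delivered incrementally
})();
```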

Stream mode options

values – emits the full state after each step; useful for debugging and inspecting state accumulation.

updates – emits only the incremental changes (which node changed what); ideal for monitoring node‑level progress.

messages – emits LLM token chunks together with metadata; enables a typewriter‑style UI.

custom – emits arbitrary data pushed by a node or tool; suited for tool‑progress updates.

debug – emits checkpoint and task events (most detailed); used for deep debugging.
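To make the first two modes concrete, here is a sketch of the payload shapes a hypothetical two-node graph (plan → write) would emit; the node names and strings are illustrative, following the example later in the article:

```typescript
// Hypothetical payloads for a two-node graph (plan -> write), contrasting
// what "values" and "updates" modes emit for the same run.
type State = { outline?: string; article?: string };

// streamMode: "values" — the FULL state after each step
const valuesChunks: State[] = [
  { outline: "Outline for: LangGraph Streaming" },
  { outline: "Outline for: LangGraph Streaming", article: "Article draft" },
];

// streamMode: "updates" — only the delta, keyed by the node that produced it
const updatesChunks: Record<string, Partial<State>>[] = [
  { plan: { outline: "Outline for: LangGraph Streaming" } },
  { write: { article: "Article draft" } },
];

console.log(Object.keys(updatesChunks[0])[0]); // "plan" — node name is the key
console.log(valuesChunks[1]); // full accumulated state
```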

Updates mode – node‑level progress

Example:

for await (const chunk of await graph.stream(inputs, { streamMode: "updates" })) {
  console.log(chunk);
  // → { plan: { outline: "Outline for: LangGraph Streaming" } }
  // → { write: { article: "Article based on: Outline for: …" } }
}

The output is a plain object keyed by node name, making it obvious which node finished and what it returned.

Messages mode – true typewriter effect

When the LLM is created with streaming: true, each generated token triggers a chunk:

for await (const [messageChunk, metadata] of await graph.stream(
  { messages: [new HumanMessage("Explain quantum entanglement in three sentences")] },
  { streamMode: "messages" }
)) {
  if (messageChunk.content) {
    process.stdout.write(messageChunk.content as string);
  }
}

The chunk format is [messageChunk, metadata]. messageChunk.content holds the text; metadata contains fields such as langgraph_node, run_id and tags.
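That metadata makes per-node filtering straightforward. The following self-contained sketch uses a hypothetical `fakeMessages` generator in place of `graph.stream(..., { streamMode: "messages" })`; the field names follow the article:

```typescript
// Sketch: filter token chunks by their originating node using the metadata
// half of the [messageChunk, metadata] tuple. fakeMessages is a hypothetical
// stand-in for a "messages"-mode stream.
type Meta = { langgraph_node: string; run_id: string; tags: string[] };
type TokenChunk = { content: string };

async function* fakeMessages(): AsyncGenerator<[TokenChunk, Meta]> {
  yield [{ content: "Outline" }, { langgraph_node: "plan", run_id: "r1", tags: [] }];
  yield [{ content: "Aut" }, { langgraph_node: "write", run_id: "r1", tags: [] }];
  yield [{ content: "umn" }, { langgraph_node: "write", run_id: "r1", tags: [] }];
}

(async () => {
  let rendered = "";
  for await (const [chunk, meta] of fakeMessages()) {
    if (meta.langgraph_node === "write") rendered += chunk.content; // writer tokens only
  }
  console.log(rendered); // "Autumn"
})();
```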

Custom mode – tool execution progress

Tools that are not LLMs can push progress updates via getLangGraphStreamWriter() inside a node or a tool:

import { tool } from "@langchain/core/tools";
import { z } from "zod";

const searchDatabase = tool(async ({ query }) => {
  const writer = getLangGraphStreamWriter();
  writer({ type: "progress", message: "Connecting…", progress: 0 });
  await new Promise(r => setTimeout(r, 300));
  writer({ type: "progress", message: "Running query…", progress: 30 });
  // …more steps…
  writer({ type: "progress", message: "Done", progress: 100 });
  return `Found 42 records for "${query}"`;
}, { name: "search_database", description: "Search internal DB", schema: z.object({ query: z.string() }) });

for await (const chunk of await graph.stream(inputs, { streamMode: "custom" })) {
  if (chunk.type === "progress") {
    console.log(`[${chunk.progress}%] ${chunk.message}`);
  }
}
getLangGraphStreamWriter() can only be called inside the LangGraph execution context; calling it anywhere else throws a runtime error.

Mixed mode – subscribing to multiple streams

Production systems often need both token‑level typing and node‑level progress. Pass an array to streamMode:

for await (const [mode, chunk] of await graph.stream(
  { messages: [new HumanMessage("Write a poem about autumn")] },
  { streamMode: ["updates", "messages", "custom"] }
)) {
  switch (mode) {
    case "updates":
      console.log("✅ Node completed:", Object.keys(chunk));
      break;
    case "messages": {
      // a lexical declaration inside a case needs its own block scope
      const [msgChunk] = chunk;
      if (msgChunk.content) process.stdout.write(msgChunk.content);
      break;
    }
    case "custom":
      if (chunk.type === "progress") console.log(`\n[Progress] ${chunk.message}`);
      break;
  }
}

In mixed mode the generator yields a two‑element tuple [mode, chunk]; single‑mode calls return the raw chunk directly.
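A minimal stand-alone sketch of that shape difference (both generators are hypothetical stand-ins for `graph.stream`):

```typescript
// Sketch: single-mode streams yield the raw chunk; mixed mode wraps every
// chunk in a [mode, chunk] tuple that must be destructured first.
async function* singleModeStream(): AsyncGenerator<Record<string, unknown>> {
  yield { plan: { outline: "draft" } }; // raw chunk — node name is the key
}

async function* mixedModeStream(): AsyncGenerator<[string, unknown]> {
  yield ["updates", { plan: { outline: "draft" } }]; // [mode, chunk] tuple
}

(async () => {
  for await (const chunk of singleModeStream()) {
    console.log(Object.keys(chunk)[0]); // "plan"
  }
  for await (const [mode] of mixedModeStream()) {
    console.log(mode); // "updates"
  }
})();
```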

Common pitfalls (5 issues)

Missing streaming: true on the LLM – the messages mode will output only at the end.

Calling getLangGraphStreamWriter outside the Graph – results in a runtime error; it must be used inside a node or a tool.

Forgetting to destructure in mixed mode – the generator yields a [mode, data] tuple, so property reads on the raw yielded value come back undefined.

Sub‑graph tokens are not propagated by default – pass subgraphs: true to .stream() to forward inner tokens.

Breaking out of the async generator early – may leak resources; use await gen.return() to close the generator cleanly.
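The last pitfall can be shown with plain async generators: when iterating manually with .next(), stopping early skips the generator's cleanup unless you call await gen.return(), which runs its finally block:

```typescript
// Sketch: closing an async generator cleanly. With manual .next() iteration,
// abandoning the generator skips its finally block; an explicit
// await gen.return() runs the cleanup path and marks the stream done.
async function* numberStream(): AsyncGenerator<number> {
  try {
    for (let i = 0; ; i++) yield i;
  } finally {
    console.log("stream closed"); // cleanup path (release sockets, files, …)
  }
}

(async () => {
  const gen = numberStream();
  const seen: number[] = [];
  for (let j = 0; j < 3; j++) {
    seen.push((await gen.next()).value as number); // manual iteration
  }
  await gen.return(undefined); // explicit, clean shutdown — prints "stream closed"
  console.log(seen); // [0, 1, 2]
})();
```

(Note that `for await … of` calls .return() automatically on break; the explicit call matters when you drive the generator by hand.)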

Key takeaways

updates is the most common starting point for node‑level progress when you only need to know which node finished and its output.

True typewriter effect requires both messages mode and streaming: true on the model; omitting either disables token‑level streaming.

Tool progress should be emitted via custom mode together with getLangGraphStreamWriter – this is the only supported path for non‑LLM tools.

Mixed mode lets you handle multiple granularities simultaneously; remember the generator yields a [mode, chunk] tuple.

When nesting graphs, pass subgraphs: true to .stream() to surface inner token streams; otherwise sub‑graph output remains hidden.


Tags: TypeScript, LLM, Streaming, Token, Node, LangGraph
Written by

James' Growth Diary

I am James, focusing on AI Agent learning and growth. I continuously update two series: “AI Agent Mastery Path,” which systematically outlines core theories and practices of agents, and “Claude Code Design Philosophy,” which deeply analyzes the design thinking behind top AI tools. Helping you build a solid foundation in the AI era.
