Why Does V8 Crash During Heap Snapshot? A Deep Dive into the Root Cause
This article investigates a Node.js V8 heap snapshot crash, tracing the segmentation fault to a faulty context extension slot, detailing the assembly analysis, reproducing the issue, and presenting a backported fix for Node‑v14.
A Node.js process unexpectedly crashed while taking a V8 heap snapshot for one of our services. This article walks through the investigation of the resulting core file.
Locating the Problem
The Corefile stack shows the crash occurs in v8::internal::V8HeapExplorer::ExtractContextReferences() at address 0x108ee9e.
(gdb) bt
#0 0x000000000108ee9e in v8::internal::V8HeapExplorer::ExtractContextReferences(v8::internal::HeapEntry*, v8::internal::Context) ()
#1 0x0000000001092435 in v8::internal::HeapSnapshotGenerator::GenerateSnapshot() ()
#2 0x0000000001081ee3 in v8::internal::HeapProfiler::TakeSnapshot(v8::ActivityControl*, v8::HeapProfiler::ObjectNameResolver*, bool) ()
...
Since the service runs a Release build without debug symbols, analysis starts from disassembly. The Release build uses -O3, so some functions may be inlined, requiring manual reasoning to locate the offending C++ code.
...
0x000000000108ee2e <+94>: mov 0x17(%rax),%rcx
0x000000000108ee32 <+98>: call 0x108da20 <_ZN2v88internal14V8HeapExplorer20SetInternalReferenceEPNS0_9HeapEntryEPKcNS0_6ObjectEi.constprop.764>
0x000000000108ee37 <+103>: mov %r13,%rdi
0x000000000108ee3a <+106>: call 0xf06740 <_ZN2v88internal7Context10scope_infoEv>
0x000000000108ee3f <+111>: mov %r15,%rdi
0x000000000108ee42 <+114>: mov %rax,-0x38(%rbp)
0x000000000108ee46 <+118>: call 0xffcf50 <_ZNK2v88internal9ScopeInfo23HasContextExtensionSlotEv>
0x000000000108ee4b <+123>: test %al,%al
0x000000000108ee4d <+125>: mov -0x78(%rbp),%rdx
0x000000000108ee51 <+129>: jne 0x108ee90 <_ZN2v88internal14V8HeapExplorer24ExtractContextReferencesEPNS0_9HeapEntryENS0_7ContextE+192>
...
0x000000000108ee90 <+192>: mov 0x1f(%rdx),%rax // get Extension
0x000000000108ee94 <+196>: mov %rax,%rcx
0x000000000108ee97 <+199>: and $0xfffffffffffc0000,%rcx // isUndefined
=> 0x000000000108ee9e <+206>: mov 0x18(%rcx),%rcx
0x000000000108eea2 <+210>: cmp %rax,-0x94a8(%rcx)
0x000000000108eea9 <+217>: je 0x108ee53
0x000000000108eeab <+219>: mov 0x1f(%rdx),%rcx
0x000000000108eeaf <+223>: mov $0x20,%r8d
0x000000000108eeb5 <+229>: mov $0x23bfd12,%edx
0x000000000108eeba <+234>: mov %r12,%rsi
0x000000000108eebd <+237>: mov %rbx,%rdi
0x000000000108eec0 <+240>: call 0x108da20 <_ZN2v88internal14V8HeapExplorer20SetInternalReferenceEPNS0_9HeapEntryEPKcNS0_6ObjectEi.constprop.764>
...
The arrow "=>" points to the current %rip where the segmentation fault occurs. It is a mov instruction loading from %rcx+0x18 into %rcx.
=> 0x000000000108ee9e <+206>: mov 0x18(%rcx),%rcx
The register %rcx holds 0x300000000, an invalid address, causing the SIGSEGV and a core dump.
(gdb) p/x $rcx
$22 = 0x300000000
Tracing back, %rcx is copied from %rax at 0x108ee94 and then masked with 0xfffffffffffc0000. %rax is loaded from 0x1f(%rdx); the odd displacement is typical of V8 tagged pointers, whose lowest bit is 1, so %rdx most likely holds a V8 object.
The block at 0x108ee90 is reached through the jne at 0x108ee51, which follows the calls to v8::internal::Context::scope_info and v8::internal::ScopeInfo::HasContextExtensionSlot.
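The displacement arithmetic behind the 0x1f(%rdx) load can be checked with a short sketch. It assumes a 64-bit build without pointer compression (8-byte tagged words), a heap object tag of 1, and a Context laid out like a FixedArray with a map word and a length word before the elements. The constant and function names are illustrative and are not V8 identifiers.

```javascript
// Assumptions (not taken from V8 headers): 8-byte tagged words, heap object
// pointers carry a low tag bit of 1, and the Context header is map + length.
const kTaggedSize = 8n;
const kHeapObjectTag = 1n;
const kHeaderSize = 2n * kTaggedSize; // map word + length word

// Displacement a compiler would emit to read element `index` through a
// tagged (address + 1) pointer.
function slotDisplacement(index) {
  return kHeaderSize + BigInt(index) * kTaggedSize - kHeapObjectTag;
}

// Inverse mapping: recover the element index from a displacement seen in
// the disassembly.
function slotIndexFromDisplacement(displacement) {
  return Number((BigInt(displacement) + kHeapObjectTag - kHeaderSize) / kTaggedSize);
}

console.log('0x' + slotDisplacement(2).toString(16)); // 0x1f
console.log(slotIndexFromDisplacement(0x17));         // 1
```

Under these assumptions, 0x1f corresponds to element 2 of the context, and the untagged offset 0x20 is the same value that the disassembly passes in %r8d to SetInternalReference. The earlier 0x17(%rax) load corresponds to element 1.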
Further inspection of ExtractContextReferences() shows the relevant code:
V8HeapExplorer::ExtractContextReferences():
  if (context.has_extension()) {
    SetInternalReference(
        entry, "extension", context.get(Context::EXTENSION_INDEX),
        FixedArray::OffsetOfElementAt(Context::EXTENSION_INDEX));
  }
The Context::has_extension() method is:
bool Context::has_extension() {
  return scope_info().HasContextExtensionSlot() && !extension().IsUndefined();
}
Expanding IsUndefined() shows that it ultimately compares the object against the Undefined oddball in the read‑only roots, which HeapObject::IsUndefined() obtains through GetReadOnlyRoots().
#define IS_TYPE_FUNCTION_DEF(Type, Value) \
...
  bool HeapObject::Is##Type(ReadOnlyRoots roots) const { \
    return Object::Is##Type(roots); \
  } \
  bool HeapObject::Is##Type() const { return Is##Type(GetReadOnlyRoots()); }
ODDBALL_LIST(IS_TYPE_FUNCTION_DEF)
#undef IS_TYPE_FUNCTION_DEF
V8 manages small objects in 256 KB pages, so masking an object address with 0xfffffffffffc0000 yields the base address of its page. Each page starts with a MemoryChunk header whose heap_ field sits at offset 0x18. This matches the faulting instruction, which means %rcx is expected to point to a MemoryChunk structure.
(gdb) p &(('v8::internal::MemoryChunk' *)0).heap_
$6 = (class v8::internal::Heap **) 0x18
(gdb) x/i $rip
=> 0x108ee9e <_ZN2v88internal14V8HeapExplorer24ExtractContextReferencesEPNS0_9HeapEntryENS0_7ContextE+206>: mov 0x18(%rcx),%rcx
The crash occurs because the code treats the Extension slot as a heap object, but the slot actually contains an SMI (0x300000000). The masked value is therefore not a page address, and the load from it is an invalid memory access.
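The two bit operations involved here, SMI decoding and page masking, can be reproduced on the values from the core file. This is a minimal sketch assuming a 64-bit build without pointer compression, where an SMI has a low bit of 0 and carries its value in the upper 32 bits. The helper names are made up for illustration.

```javascript
const kPageMask = 0xfffffffffffc0000n; // clears the low 18 bits (256 KB pages)
const kHeapFieldOffset = 0x18n;        // offset of heap_ reported by gdb

// A tagged word is an SMI when its lowest bit is 0.
function isSmi(word) {
  return (word & 1n) === 0n;
}

// Assumed SMI encoding: the 32-bit payload lives in the upper half of the word.
function smiValue(word) {
  return Number(BigInt.asIntN(64, word) >> 32n);
}

// Address the faulting instruction tries to read: page base + 0x18.
function heapFieldAddress(word) {
  return (word & kPageMask) + kHeapFieldOffset;
}

const slot = 0x300000000n;         // content of the extension slot
console.log(isSmi(slot), smiValue(slot));                // true 3
console.log('0x' + (slot & kPageMask).toString(16));     // 0x300000000
console.log('0x' + heapFieldAddress(slot).toString(16)); // 0x300000018

const context = 0x26a71b4be039n;   // tagged pointer to the Context object
console.log(isSmi(context));                             // false
console.log('0x' + (context & kPageMask).toString(16));  // 0x26a71b480000
```

Because the low 18 bits of 0x300000000 are already zero, the mask leaves the value unchanged, which is why %rcx still reads 0x300000000 at the moment of the fault.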
Inspecting the Context object at 0x26a71b4be039 shows that the third slot (index 2, the extension index) holds the SMI value 3, encoded as 0x300000000, while the second slot (previous) is undefined. The ScopeInfo nevertheless reports that the context has an extension slot, which is what triggers the bug.
(gdb) v8 i 0x26a71b4be039
[Context]
- length: 5
- scope_info: <ScopeInfo 0x286aaa782231>
- previous: <Oddball 'undefined' 0x286aaa780471>
- native_context: <NativeContext 0xf0168a00121>
- context
- 0: <ScopeInfo 0x286aaa782231> [ScopeInfoIndex]
- 1: <Oddball 'undefined' 0x286aaa780471> [PreviousIndex]
- 2: smi: 3 (0x300000000) [ExtensionIndex]
- 3: <PromiseCapability 0x26a71b4be0d9> [GlobalProxyIndex]
- 4: <JsArray 0x26a71b4be0f9> [EmbedderDataIndex]
...
The ScopeInfo flags indeed show has context extension slot set:
(gdb) v8 i 0x286aaa782231
[ScopeInfo]
- flags: (0x40101c4)
- has context extension slot
This contradictory state points to a runtime bug. The V8 source shows that an extension slot exists only for module and with scopes, or when sloppy_eval_can_extend_vars_ is true:
bool HasContextExtensionSlot() const {
  switch (scope_type_) {
    case MODULE_SCOPE:
    case WITH_SCOPE:
      return true;
    default:
      DCHECK_IMPLIES(sloppy_eval_can_extend_vars_,
                     scope_type_ == FUNCTION_SCOPE ||
                         scope_type_ == EVAL_SCOPE ||
                         scope_type_ == BLOCK_SCOPE);
      DCHECK_IMPLIES(sloppy_eval_can_extend_vars_, is_declaration_scope());
      return sloppy_eval_can_extend_vars_;
  }
  UNREACHABLE();
}
Most contexts (e.g., PromiseAllResolveElementContext) lack an extension slot. The observed context matches the layout of PromiseAllResolveElementContextSlots: its third slot sits at the index that generic code treats as ExtensionIndex and expects to contain a heap object, but this context stores an SMI there.
Thus the bug stems from an incorrect flag in ScopeInfo that makes V8 treat an SMI as a heap object during heap snapshot generation, causing the crash.
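The failure mode can be illustrated with a toy model. This sketch is not V8 code: the page table is a plain Map, the "segmentation fault" is a thrown RangeError, and every name in it is hypothetical. Only the addresses and the shape of the has_extension() check are taken from the analysis above.

```javascript
const kPageMask = 0xfffffffffffc0000n;
const kUndefined = 0x286aaa780471n; // 'undefined' oddball address from the dump

// Toy page table: only the page holding the undefined oddball is "mapped".
const mappedPages = new Map([
  [kUndefined & kPageMask, { undefinedRoot: kUndefined }],
]);

// Stand-in for GetReadOnlyRoots(): derive the page header from the object
// address, and fail if that page does not exist.
function readOnlyRootsVia(word) {
  const base = word & kPageMask;
  const header = mappedPages.get(base);
  if (header === undefined) {
    throw new RangeError(`simulated SIGSEGV: page 0x${base.toString(16)} is not mapped`);
  }
  return header;
}

// Same shape as Context::has_extension(): trust the flag, then dereference.
function hasExtension(context) {
  if (!context.scopeInfo.hasContextExtensionSlot) return false;
  const extension = context.slots[2];
  return extension !== readOnlyRootsVia(extension).undefinedRoot;
}

// Slot contents copied from the Context dump.
const slots = [0x286aaa782231n, kUndefined, 0x300000000n, 0x26a71b4be0d9n, 0x26a71b4be0f9n];
const consistent = { scopeInfo: { hasContextExtensionSlot: false }, slots };
const corrupted = { scopeInfo: { hasContextExtensionSlot: true }, slots };

console.log(hasExtension(consistent)); // false
try {
  hasExtension(corrupted);
} catch (err) {
  console.log(err.message); // simulated SIGSEGV: page 0x300000000 is not mapped
}
```

With a correct flag the SMI slot is never inspected. With the wrong flag the check dereferences the SMI as if it were an object address, which is the crash seen in the core file.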
Reproduction Case
A minimal reproducer validates the analysis:
function that() {
  const p = new Promise(resolve => {
    setTimeout(resolve, 1);
  });
  Promise.all([p]); // triggers PromiseAllResolveElementContext
}
that();
const v8 = require('v8');
const fs = require('fs');
const stream = fs.createWriteStream('./node.heapsnapshot');
v8.getHeapSnapshot().pipe(stream);
Running this on Node‑v14 reproduces the crash, while Node‑v16 works correctly.
(gdb) r
Starting program: .../node ./test.js
...
Program received signal SIGSEGV, Segmentation fault.
0x000000000100697e in v8::internal::V8HeapExplorer::ExtractContextReferences(...)
(gdb) p/x $rcx
$1 = 0x100000000
The issue is fixed in newer V8 revisions. Backporting the patch from V8 revision 2277806 to Node‑v14 resolves the crash.
The corresponding Node.js issue is #42558. The fix is expected in the next Node‑v14 release.
Node Underground
No language is immortal—Node.js isn’t either—but thoughtful reflection is priceless. This underground community for Node.js enthusiasts was started by Taobao’s Front‑End Team (FED) to share our original insights and viewpoints from working with Node.js. Follow us. BTW, we’re hiring.
