Do Large Language Models Mirror Human Brain Language Processing? Google’s Groundbreaking Findings
Google researchers discovered a linear relationship between brain activity recorded during natural conversation and the internal embeddings of a speech‑to‑text large language model, revealing that acoustic and lexical representations from the model can accurately predict neural responses in both language comprehension and production.
Overview
Google Research recorded electrocorticography (ECoG) signals from participants as they spoke and listened during natural, open‑ended conversations, collecting more than 100 hours of neural data. The speech‑to‑text model Whisper was run on the same audio, and three levels of internal representation were extracted for each word: low‑level acoustic embeddings, mid‑level phonetic embeddings, and high‑level lexical (word‑level) embeddings.
Methodology
For each word, the researchers aligned the model embeddings with the ECoG signal in a time window from –2 s to +2 s relative to word onset. A linear encoding model (ridge regression) was trained on a subset of the data to map the embedding vector to the neural activity at each electrode. Model performance was evaluated on held‑out dialogue segments by computing the correlation between predicted and observed signals.
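The encoding step described above can be sketched as follows. This is a minimal illustration on synthetic data, not the researchers' pipeline: the array shapes, the regularization strength `alpha`, the single fixed lag, and the function names `fit_ridge`, `pearson_per_column`, and `encoding_scores` are all assumptions made for the example.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def pearson_per_column(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.sqrt((A ** 2).sum(axis=0) * (B ** 2).sum(axis=0))
    return (A * B).sum(axis=0) / denom

def encoding_scores(embeddings, neural, n_train, alpha=1.0):
    """Fit on the first n_train words, score on the held-out remainder.

    embeddings: (n_words, n_dims) word-level model embeddings
    neural:     (n_words, n_electrodes) neural signal at ONE lag (assumption)
    Returns one correlation per electrode.
    """
    X_train, X_test = embeddings[:n_train], embeddings[n_train:]
    Y_train, Y_test = neural[:n_train], neural[n_train:]
    # Center with training statistics only, so no test information leaks in.
    x_mean, y_mean = X_train.mean(axis=0), Y_train.mean(axis=0)
    W = fit_ridge(X_train - x_mean, Y_train - y_mean, alpha)
    Y_pred = (X_test - x_mean) @ W + y_mean
    return pearson_per_column(Y_pred, Y_test)

# Synthetic stand-in data: a known linear map plus noise (hypothetical values).
rng = np.random.default_rng(0)
n_words, n_dims, n_electrodes = 400, 16, 3
embeddings = rng.standard_normal((n_words, n_dims))
true_W = rng.standard_normal((n_dims, n_electrodes))
neural = embeddings @ true_W + 0.5 * rng.standard_normal((n_words, n_electrodes))

scores = encoding_scores(embeddings, neural, n_train=300, alpha=1.0)
```

In a lag-resolved analysis, a fit like this would be repeated once per time lag in the –2 s to +2 s window, which yields one correlation curve per electrode.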
Key Findings
During speech perception, acoustic embeddings best predicted activity in the superior temporal gyrus (STG) shortly after the word was heard, whereas lexical embeddings peaked several hundred milliseconds later in Broca’s area (inferior frontal gyrus, IFG), reflecting semantic processing. During speech production the temporal order reversed: lexical embeddings predicted IFG activity ~500 ms before articulation, followed by acoustic embeddings predicting motor‑cortex activity as the phonetic plan was executed. The full temporal profile showed a sequence of planning in language areas, execution in motor areas, and auditory monitoring after speech.
Quantitatively, the encoding model achieved peak Pearson correlations of ~0.3–0.4 for the best electrodes, and the latency of the peak prediction matched known neurophysiological timings (STG ≈ 100 ms, IFG ≈ 300–500 ms).
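As a rough sketch of how a peak correlation and its latency could be read off such an analysis, the snippet below computes a Pearson correlation and picks the lag with the highest score. The correlation curve is synthetic and hypothetical (chosen to peak at 100 ms with a value of 0.35), and `pearson` and `peak_lag` are illustrative names, not functions from the study.

```python
from math import exp, sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def peak_lag(lags_ms, correlations):
    """Return (lag, correlation) at the maximum of the correlation curve."""
    best = max(range(len(lags_ms)), key=lambda i: correlations[i])
    return lags_ms[best], correlations[best]

# Lags from -2000 ms to +2000 ms in 100 ms steps, relative to word onset.
lags_ms = list(range(-2000, 2001, 100))

# Hypothetical per-lag encoding performance for one electrode:
# a bump centered at +100 ms with a peak correlation of 0.35.
curve = [0.35 * exp(-(((lag - 100) / 300) ** 2)) for lag in lags_ms]

best_lag, best_r = peak_lag(lags_ms, curve)
```

In practice each entry of `curve` would come from calling `pearson` on the predicted and observed held-out signal at that lag, rather than from a formula.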
Implications
The results demonstrate that embeddings from a speech‑to‑text LLM trained solely for transcription capture statistical regularities that align with human cortical dynamics during natural language processing. This provides a concrete computational framework for linking artificial neural network representations to brain activity, and suggests that large models can serve as proxies for studying the neural basis of language.
Related Work
A study in Nature Neuroscience (2022) showed that human listeners predict upcoming words and that prediction confidence modulates post‑onset surprise.
A study in Nature Communications (2024) reported geometric similarity between LLM embedding spaces and brain‑derived representational spaces.
Future Directions
The team plans to design biologically inspired neural architectures that exploit the observed brain‑model parallels to improve real‑world information processing, and to test whether such models better predict neural responses in more complex linguistic contexts.
References
https://research.google/blog/deciphering-language-processing-in-the-human-brain-through-llm-representations/
https://www.nature.com/articles/s41562-025-02105-9
https://x.com/GoogleAI/status/1903149951166902316
https://x.com/rohanpaul_ai/status/1903373048260284868
