How MemWeaver Combines Behavioral and Cognitive Memory to Rebuild LLM Personalization

MemWeaver introduces a hierarchical memory that fuses behavior‑level and cognition‑level user signals so that large language models can generate more personalized content across multiple tasks. Extensive experiments and ablations show gains over strong baselines, and an incremental update mechanism keeps the memory cheap to maintain.

PaperAgent

Why Traditional Personalization Methods Fall Short

Conventional systems rely on clicks, purchases, and dwell time, which work well for recommendation but lack the depth needed for generative AI. Clicks reveal what users chose but not why they chose it. Historical text, by contrast, carries tone, style, and evolving preferences, yet treating it as a flat text corpus makes it hard for models to capture the underlying structure.

The core challenge is not the scarcity of user data but how to organize massive, continuous, and evolving textual interactions into a memory that models can effectively exploit.

What MemWeaver Does

MemWeaver upgrades user history from a "flat text" representation to a "hierarchical memory" by explicitly modeling two crucial structures:

Temporal structure: recent behaviors are more important than older ones.

Semantic structure: distant actions can be highly related in topic.

To capture both dimensions, MemWeaver builds two complementary memory modules.

1. Behavioral Memory

Behavioral memory records what the user "did". Each historical textual interaction becomes a node, linked by:

Temporal edges between adjacent actions.

Semantic edges between actions with similar topics.

The model performs a preference‑biased random walk on this graph to extract a behavior chain most relevant to the current task, providing fine‑grained, traceable context.
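The graph construction and walk described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `bow`, `cosine`, `build_graph`, `biased_walk`, and the `sim_threshold` parameter are hypothetical names, a bag-of-words cosine stands in for a learned text encoder, and the transition weight (edge weight times query relevance) is one plausible reading of "preference‑biased".

```python
import math
import random
from collections import Counter

def bow(text):
    """Toy bag-of-words vector; a real system would use a learned text encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_graph(history, sim_threshold=0.3):
    """history: texts in chronological order. Returns ({node: {neighbor: weight}}, vectors)."""
    vecs = [bow(t) for t in history]
    graph = {i: {} for i in range(len(history))}
    # Temporal edges: link each action to the one that follows it.
    for i in range(len(history) - 1):
        graph[i][i + 1] = graph[i + 1][i] = 1.0
    # Semantic edges: link non-adjacent actions whose similarity passes the threshold.
    for i in range(len(history)):
        for j in range(i + 2, len(history)):
            s = cosine(vecs[i], vecs[j])
            if s >= sim_threshold:
                graph[i][j] = graph[j][i] = s
    return graph, vecs

def biased_walk(graph, vecs, query, steps=3, seed=0):
    """Start at the node most similar to the query, then walk without revisiting nodes.
    Each step samples a neighbor with probability proportional to
    edge weight * relevance to the query (an assumed form of the bias)."""
    rng = random.Random(seed)
    q = bow(query)
    relevance = [cosine(v, q) for v in vecs]
    current = max(range(len(vecs)), key=lambda i: relevance[i])
    chain = [current]
    for _ in range(steps):
        candidates = [n for n in graph[current] if n not in chain]
        if not candidates:
            break
        weights = [graph[current][n] * (relevance[n] + 1e-6) for n in candidates]
        current = rng.choices(candidates, weights=weights, k=1)[0]
        chain.append(current)
    return chain
```

The returned chain is a list of history indices, so each piece of context handed to the model can be traced back to a concrete past interaction.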

2. Cognitive Memory

Cognitive memory captures what the user "cares about" over the long term. The user’s history is divided into temporal stages; each stage is summarized to extract stable interests, expression styles, and value orientations, which are then merged into a global user portrait. This abstraction offers a stable, high‑level bias.
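The stage‑then‑merge process can be sketched as follows. This is an illustration only: the function names and the fixed `stage_size` are assumptions, and `keyword_summarizer` is a crude stand‑in for the LLM‑based summarization a real system would use.

```python
from collections import Counter
from typing import Callable, List, Tuple

def split_into_stages(history: List[str], stage_size: int) -> List[List[str]]:
    """Cut a chronologically ordered history into consecutive stages."""
    return [history[i:i + stage_size] for i in range(0, len(history), stage_size)]

def keyword_summarizer(texts: List[str], top_k: int = 3) -> str:
    """Stand-in for an LLM summarizer: keep the most frequent words longer than 3 chars."""
    counts = Counter(w for t in texts for w in t.lower().split() if len(w) > 3)
    ranked = sorted(counts, key=lambda w: (-counts[w], w))  # deterministic tie-breaking
    return " ".join(ranked[:top_k])

def build_cognitive_memory(
    history: List[str],
    summarize: Callable[[List[str]], str] = keyword_summarizer,
    stage_size: int = 3,
) -> Tuple[List[str], str]:
    """Summarize each stage, then merge the stage summaries into one global portrait."""
    stages = split_into_stages(history, stage_size)
    stage_summaries = [summarize(stage) for stage in stages]
    portrait = summarize(stage_summaries)
    return stage_summaries, portrait
```

Passing the summarizer in as a callable keeps the staging logic independent of whichever model produces the summaries.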

3. Why Both Memories Are Needed

Personalized generation must "be like this person" while also "answer the specific query". Behavioral memory supplies concrete evidence; cognitive memory supplies global preference. Without cognitive memory the model lacks direction; without behavioral memory it lacks detail.
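One simple way to picture how the two memories meet is at prompt construction time. The template below is purely hypothetical (the summary does not give the actual prompt format); `build_prompt` and its section labels are illustrative names.

```python
from typing import List

def build_prompt(query: str, behavior_chain: List[str], portrait: str) -> str:
    """Combine the global portrait (cognitive memory) with retrieved
    behaviors (behavioral memory) into a single prompt string."""
    evidence = "\n".join(f"- {item}" for item in behavior_chain)
    return (
        "User profile (long-term preferences):\n"
        f"{portrait}\n\n"
        "Relevant past behaviors:\n"
        f"{evidence}\n\n"
        f"Request: {query}\n"
        "Respond in a way that fits this user."
    )
```

Dropping either argument reproduces the two failure modes described above: a prompt with no portrait has evidence but no stable direction, and one with no behavior chain has direction but no concrete detail.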

Experimental Setup

Experiments were conducted on the LaMP benchmark (six publicly available tasks covering personalized classification, rating prediction, headline generation, academic title generation, and tweet rewriting). The protocol follows LaMP's temporal split, dividing each user’s history into training, validation, and test sets.

Metrics used:

LaMP‑1, LaMP‑2: Accuracy, F1

LaMP‑3: MAE, RMSE

LaMP‑4, LaMP‑5, LaMP‑7: ROUGE‑1, ROUGE‑L
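For readers less familiar with these metrics, here is a simplified sketch of how they are computed. Two assumptions are worth flagging: F1 is macro‑averaged here, which the summary does not specify, and `rouge1_f` uses plain whitespace tokens, whereas official ROUGE tooling applies its own tokenization and optional stemming. ROUGE‑L, which is based on the longest common subsequence, is omitted for brevity.

```python
import math
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 scores (macro averaging is an assumption)."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def mae(y_true, y_pred):
    """Mean absolute error for rating prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large rating errors more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def rouge1_f(reference, prediction):
    """Unigram-overlap F1 between a reference and a generated text."""
    ref, pred = Counter(reference.lower().split()), Counter(prediction.lower().split())
    overlap = sum((ref & pred).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

Note the direction of each metric: higher is better for accuracy, F1, and ROUGE, while lower is better for MAE and RMSE.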

Baselines included Vanilla, Random, Recency, BM25, BGE, ROPG, and CFRAG, each evaluated on two backbone models: Qwen3‑8B and Llama‑3.1‑8B‑Instruct.

Main Results: Effectiveness Across the Board

MemWeaver achieved the best performance on all 12 evaluation metrics, demonstrating consistent gains not only on generation tasks but also on classification and rating prediction. The improvement is not isolated to a single task; it spans tasks, metrics, and backbone models, indicating a robust overall enhancement.

The results confirm that temporal evolution and semantic connections in user history are essential signals for personalized modeling.

Ablation Studies: Why It Works

Removing components revealed their contributions:

Without cognitive memory, performance drops, showing that long‑term preference abstraction is necessary.

Without behavioral memory, performance degrades even more, highlighting the importance of concrete historical evidence.

Eliminating either temporal edges or semantic edges harms results, indicating that both structures contribute.

Discarding the edge‑weight mechanism leads to the most severe degradation, indicating that modeling connection strength is critical.

Thus, MemWeaver’s gains stem from its structured design rather than merely adding extra modules.

Case Study: Does the Model Really Understand the User?

In a birthday‑gift recommendation scenario, a generic model suggests common items (books, gift boxes). MemWeaver, leveraging the user’s past mentions of handmade crafts, specialty coffee, and experiential interests, abstracts underlying preferences and generates a recommendation that aligns closely with the user’s personality.

In other words, MemWeaver aims to understand a user’s stable interests and latent preferences from their past utterances, not just recall exact past statements.

Incremental Update for Real‑World Systems

MemWeaver includes an incremental update mechanism so that new behaviors can be added without rebuilding the entire memory:

New actions are inserted as nodes in the behavioral memory.

Cognitive memory integrates new stages via lightweight aggregation.

Experiments show that this strategy achieves performance comparable to full reconstruction while dramatically reducing computational cost, making it suitable for online personalization.
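The two update steps can be sketched as a small class that never rebuilds existing state. Everything here is an assumption made for illustration: the class and method names, the fixed `stage_size` trigger for closing a stage, the bag‑of‑words similarity, and the merge rule (re‑summarizing the old portrait together with the new stage summary). The summarizer is passed in as a callable and would be an LLM call in practice.

```python
import math
from collections import Counter

def bow(text):
    """Toy bag-of-words vector; a real system would use a learned text encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class IncrementalMemory:
    def __init__(self, summarize, stage_size=3, sim_threshold=0.3):
        self.summarize = summarize          # callable: list of texts -> summary string
        self.stage_size = stage_size
        self.sim_threshold = sim_threshold
        self.texts, self.vecs, self.graph = [], [], {}
        self.stage_summaries, self.portrait = [], ""
        self.pending = []                   # actions not yet folded into a stage

    def _link(self, a, b, weight):
        self.graph[a][b] = self.graph[b][a] = weight

    def add(self, text):
        """Insert one new action; only edges touching the new node are computed."""
        idx, vec = len(self.texts), bow(text)
        self.graph[idx] = {}
        if idx > 0:
            self._link(idx, idx - 1, 1.0)   # temporal edge to the previous action
        for j in range(idx - 1):            # semantic edges to older, similar actions
            s = cosine(vec, self.vecs[j])
            if s >= self.sim_threshold:
                self._link(idx, j, s)
        self.texts.append(text)
        self.vecs.append(vec)
        self.pending.append(text)
        if len(self.pending) == self.stage_size:
            # Summarize only the new stage, then merge it into the portrait.
            summary = self.summarize(self.pending)
            self.stage_summaries.append(summary)
            self.portrait = (
                self.summarize([self.portrait, summary]) if self.portrait else summary
            )
            self.pending = []
        return idx
```

In this sketch each insertion costs one pass over existing nodes plus an occasional summary of a single stage, whereas a full rebuild would recompute every pairwise similarity and re‑summarize every stage.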

Core Takeaway

As generative AI moves into real applications, personalization must evolve from merely recognizing what users like to understanding why they like it, how preferences change, and how to maintain consistency across contexts. MemWeaver demonstrates that a hierarchical memory combining behavioral and cognitive signals can provide this deeper understanding.

https://arxiv.org/abs/2510.07713
MemWeaver: A Hierarchical Memory from Textual Interactive Behaviors for Personalized Generation
Shuo Yu, Mingyue Cheng, Daoyu Wang, Qi Liu, Zirui Liu, Ze Guo, Xiaoyu Tao
State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China
https://github.com/fishsure/MemWeaver