How TencentDB Agent Memory Boosts Recall by 167% and Redefines Agent Context Management

The article examines the inherent limits of traditional AI context memory, surveys three common memory implementations, introduces TencentDB Agent Memory's hierarchical long-term and symbolic short-term architecture, presents benchmark gains (user fact recall up by 167% and token savings of over 60%), and provides step-by-step deployment and optimization guidance.

AI Architecture Path

Problem with Traditional AI Context Memory

Typical AI agents suffer from a "context blackboard" limitation: the context window has a hard token cap, old content must be deleted or crudely summarized, and critical user preferences are easily lost.

Survey of Existing Memory Approaches

Full injection into context window – simplest to implement and works for short dialogs, but token limits cause rapid window overflow, forcing a complete reset of history and breaking cross‑session memory.

Raw dialogue slices stored in a vector store – removes the token ceiling and persists data, yet treats every utterance equally; factual recall is low because retrieval relies solely on similarity scores.

Model‑generated summarization – compresses context but discards original information irreversibly, making detailed verification impossible.
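The first failure mode in this survey can be made concrete with a small sketch. It assumes a crude whitespace word count in place of a real tokenizer, and `fit_to_window` is a hypothetical helper, not part of any system discussed here. It shows how a hard token cap silently evicts an early user preference.

```python
from typing import List

def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word (real tokenizers differ).
    return len(text.split())

def fit_to_window(messages: List[str], max_tokens: int) -> List[str]:
    """Keep the most recent messages that fit the budget, dropping the oldest first."""
    kept: List[str] = []
    used = 0
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = [
    "user: I always want answers in TypeScript, never JavaScript",
    "assistant: understood",
    "user: here is a long log " + "line " * 20,
    "user: now write the parser",
]
window = fit_to_window(history, max_tokens=35)
preference_survived = history[0] in window  # False: the preference was evicted
```

The long log message consumes most of the budget, so the oldest message, which carries the language preference, no longer fits.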

TencentDB Agent Memory Architecture

To overcome these drawbacks, the solution adopts a dual‑layer design: a hierarchical long‑term memory (L0‑L3) combined with a symbolic short‑term memory.

L0 Raw Dialogue Layer: stores complete JSONL logs locally (~/.openclaw/memory-tdai/raw/).
L1 Atomic Fact Layer: AI extracts structured independent facts (preferences, constraints, past pitfalls) with deduplication and conflict detection.
L2 Scene Clustering Layer: aggregates facts per project/scene into readable Markdown files (scenes/*.md).
L3 User Persona Layer: continuously iterates a user profile (persona.md) that captures long‑term coding style, tech‑stack bias, and collaboration rules.

All layers are persisted in a local SQLite database (or optionally the TCVDB cloud vector store), ensuring zero external API dependence and full privacy control.
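As a rough illustration of the atomic fact layer, the sketch below stores facts in SQLite using Python's standard `sqlite3` module and reports whether each new fact was added, was a duplicate, or conflicted with an earlier value. The table layout, the `add_fact` helper, and the "newest statement wins" policy are all assumptions made for illustration; the article does not describe the actual schema or conflict rules.

```python
import sqlite3

def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Hypothetical schema: one value per (user, subject) pair.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS facts (
               user_id TEXT NOT NULL,
               subject TEXT NOT NULL,
               value   TEXT NOT NULL,
               PRIMARY KEY (user_id, subject)
           )"""
    )
    return conn

def add_fact(conn: sqlite3.Connection, user_id: str, subject: str, value: str) -> str:
    """Store a fact and return 'added', 'duplicate', or 'conflict'."""
    row = conn.execute(
        "SELECT value FROM facts WHERE user_id = ? AND subject = ?",
        (user_id, subject),
    ).fetchone()
    if row is None:
        conn.execute("INSERT INTO facts VALUES (?, ?, ?)", (user_id, subject, value))
        conn.commit()
        return "added"
    if row[0] == value:
        return "duplicate"  # same fact seen again, nothing to store
    # Assumed policy: the newer statement replaces the older one.
    conn.execute(
        "UPDATE facts SET value = ? WHERE user_id = ? AND subject = ?",
        (value, user_id, subject),
    )
    conn.commit()
    return "conflict"

conn = open_store()
r1 = add_fact(conn, "dev_001", "preferred_language", "TypeScript")
r2 = add_fact(conn, "dev_001", "preferred_language", "TypeScript")
r3 = add_fact(conn, "dev_001", "preferred_language", "Rust")
r4 = add_fact(conn, "dev_002", "preferred_language", "Go")  # separate user, separate row
```

Keying every row on `user_id` keeps one user's facts from overwriting another's. A real system would likely keep a history of superseded values instead of overwriting them.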

Short‑Term Memory Compression

The short‑term side uses a Mermaid canvas and three automatic offloading levels to keep the active context lightweight.

L1 Real-time Summary – triggered when context occupies ≥60% of the window; replaces raw tool output in the active context with a single concise line, while the full output is offloaded rather than discarded, so nothing is lost.

L2 Mermaid Canvas – periodically generates a topology diagram that condenses hundreds of tokens of log data into a few hundred characters.

L3 Deep Clean – aggressively archives expired tasks (≥80% compression, ≥95% for urgent cases), instantly pulling the context back under safe limits.
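The first two levels above can be sketched as follows. The 60% threshold comes from the article; everything else (the `maybe_offload` and `render_canvas` helpers, the in-memory archive, and the use of the first output line as the summary) is an assumption for illustration and not the product's implementation. A real system would presumably have a model write the summary.

```python
from typing import Dict, List, Tuple

OFFLOAD_THRESHOLD = 0.60  # from the article: summarize once context use reaches 60%

def maybe_offload(step_id: str, tool_output: str, used_tokens: int,
                  window_tokens: int, archive: Dict[str, str]) -> str:
    """Return the text to keep in context, archiving raw output when over threshold."""
    if used_tokens / window_tokens < OFFLOAD_THRESHOLD:
        return tool_output
    archive[step_id] = tool_output  # raw text stays retrievable by step id
    lines = tool_output.strip().splitlines()
    first_line = lines[0][:80] if lines else ""
    return f"[offloaded {step_id}] {first_line}"

def render_canvas(steps: List[Tuple[str, str, str]]) -> str:
    """Render (node_id, label, status) steps as a linear Mermaid flowchart."""
    out = ["flowchart TD"]
    for node_id, label, status in steps:
        out.append(f'    {node_id}["{label} ({status})"]')
    for (a, _, _), (b, _, _) in zip(steps, steps[1:]):
        out.append(f"    {a} --> {b}")
    return "\n".join(out)

archive: Dict[str, str] = {}
kept = maybe_offload("s1", "200 results found\nrow1\nrow2",
                     used_tokens=70_000, window_tokens=100_000, archive=archive)
canvas = render_canvas([("s1", "web_search", "done"), ("s2", "summarize", "running")])
```

The context keeps only the one-line stub and the compact diagram text, while the archive holds the full tool output for later lookup.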

Hybrid Retrieval Strategy

To avoid the pitfalls of pure vector or pure keyword search, the system fuses BM25 keyword results with embedding‑based semantic matches using Reciprocal Rank Fusion (RRF), achieving both precise keyword hits and fuzzy semantic coverage.
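A minimal sketch of Reciprocal Rank Fusion follows. Each document scores the sum of 1 / (k + rank) over every ranked list in which it appears. The value k = 60 is a commonly used default; the article does not say which value the system uses, and the document ids here are made up.

```python
from collections import defaultdict
from typing import Dict, List

def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked id lists; a higher fused score means a better final rank."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

bm25_hits = ["fact_7", "fact_2", "fact_9"]    # hypothetical keyword ranking
vector_hits = ["fact_2", "fact_4", "fact_7"]  # hypothetical semantic ranking
fused = rrf_fuse([bm25_hits, vector_hits])
# fused == ["fact_2", "fact_7", "fact_4", "fact_9"]
```

Documents found by both retrievers rise to the top, and because only ranks are used, BM25 scores and embedding similarities never need to be put on a common scale.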

Benchmark Results

Metric                         OpenClaw (baseline)   Agent Memory   Relative improvement
Long-term memory accuracy      47.85%                76.10%         +59%
User fact recall               29.63%                79.07%         +167%
User preference tracking       66.67%                83.45%         +25%
Personalized recommendation    46.67%                76.36%         +64%

Additional task-level tests show token reduction of up to 61.38% alongside success-rate gains: WideSearch rose from 33% to 50% (a 51.52% relative gain), and SWE-bench rose from 58.4% to 64.2% while using 33% fewer tokens.
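The improvement figures are relative gains over the baseline, not percentage-point differences. The short check below recomputes them from the reported scores; `relative_gain` is simply a helper name chosen for this illustration.

```python
def relative_gain(baseline: float, improved: float) -> float:
    """Relative improvement over the baseline, in percent."""
    return (improved - baseline) / baseline * 100.0

reported = {
    "Long-term memory accuracy": (47.85, 76.10),
    "User fact recall": (29.63, 79.07),
    "User preference tracking": (66.67, 83.45),
    "Personalized recommendation": (46.67, 76.36),
    "WideSearch success rate": (33.0, 50.0),
}
gains = {name: relative_gain(b, n) for name, (b, n) in reported.items()}
for name, gain in gains.items():
    print(f"{name}: +{gain:.2f}%")
```

For example, user fact recall rises by 49.44 percentage points, which is about 167% of the 29.63% baseline.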

Deployment Options

Three practical ways to install the memory component:

OpenClaw plugin

# Install memory plugin
openclaw plugins install @tencentdb-agent-memory/memory-tencentdb
# Restart gateway
openclaw gateway restart

Hermes Agent Docker image (enterprise batch deployment)

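# Pull the image
docker pull tencentcloud/hermes-agent-memory:latest
# Run it detached on port 3000, using the same image name that was pulled
docker run -d -p 3000:3000 tencentcloud/hermes-agent-memory:latest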

Standalone service (custom Agent development)

# Clone source
git clone https://github.com/TencentCloud/TencentDB-Agent-Memory.git
cd TencentDB-Agent-Memory
npm install
# Start service
npm start

Python SDK quick start example:

# pip install tencentdb-agent-memory-sdk
import asyncio
from tencent_agent_memory import AgentMemory

async def main():
    memory = AgentMemory(storage="sqlite", db_path="./local_memory.db", user_id="dev_001")
    # Offload a bulky tool result, then mark its node as done on the task canvas
    node_id = await memory.offload(task_id="task01", step=1, content="massive search results", action="web_search")
    canvas = await memory.update_canvas(task_id="task01", node_id=node_id, status="done")
asyncio.run(main())  # await is only valid inside a coroutine, so run one

Practical Tips & Pitfalls

When switching storage to TCVDB, edit memory.yaml (set storage: tcvdb) and provide the Tencent Cloud API key; never modify underlying library files directly.

For external embeddings (e.g., OpenAI, Tongyi Qianwen), configure the endpoint and secret in the config; otherwise a lightweight local vectorizer is used.

Avoid deleting the memory-tdai folder manually; use the SDK or plugin commands to clean invalid memories, otherwise the full‑trace index breaks.

In multi‑tenant scenarios, ensure each user has a distinct user_id so that memories are isolated and personas do not mix.

AI‑Native Database Ecosystem

TencentDB Agent Memory is part of Tencent Cloud’s AI‑native database suite, which includes:

DatabaseClaw – a 24/7 DB‑ops Agent handling inspections, slow‑SQL diagnosis, and anomaly detection with four‑layer permission isolation.

TDSQL Boundless – a next‑generation multimodal distributed database that unifies relational transactions, vector search, and full‑text search.

TDSQL‑C Cloud‑Native Database – a dual‑engine (MySQL + PostgreSQL) system offering 200%+ TCO reduction and zero‑RPO replication, seamlessly integrating with AI services like Cursor and FastGPT.

Together they form a complete data foundation from the storage engine up through the Agent memory layer to AI‑driven operational agents.
