
GBrain’s 14K‑Star Open‑Source System Solves AI Agent Forgetting

GBrain, the open‑source AI agent memory platform with over 14,000 GitHub stars, uses a three‑layer architecture—Markdown‑based truth source, hybrid retrieval with PGLite, and 34 skill workflows—to eliminate agent forgetting, achieve a 31.4% retrieval boost, and provide Python integration via the MCP protocol, while outlining practical deployment pitfalls.

Data Party THU

Jun 11, 2026


Overview

GBrain is an open‑source AI agent memory system released by YC president Garry Tan. It has attracted more than 14,000 GitHub stars and aims to solve the "forgetting" problem that plagues modern agents. On a 240‑page knowledge base it achieves a P@5 of 49.1% and an R@5 of 97.9%, which the article reports as a 31.4% improvement over pure vector search.
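For readers unfamiliar with the two metrics, here is a minimal sketch of how P@5 and R@5 are usually computed. The page names and relevance labels are made up for illustration and are not taken from GBrain's benchmark.

```python
def precision_at_k(ranked: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of the top-k results that are relevant."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / k


def recall_at_k(ranked: list[str], relevant: set[str], k: int = 5) -> float:
    """Fraction of all relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    return hits / len(relevant)


# Hypothetical query: five returned pages, two of which are actually relevant.
ranked = ["alice", "acme", "bob", "q3-meeting", "carol"]
relevant = {"alice", "q3-meeting"}

print(precision_at_k(ranked, relevant))  # 0.4
print(recall_at_k(ranked, relevant))     # 1.0
```

When a query has only a couple of relevant pages, recall can be near perfect while precision stays moderate, which is the same shape as the reported 97.9% and 49.1%.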

Three‑Layer Architecture

Layer 1 – Brain Repo (Truth Source)

The bottom layer stores every entity (person, company, concept, meeting, etc.) as a Markdown file under Git version control. Each file contains two sections:

Compiled Truth: a concise, up‑to‑date summary written at the top.

Timeline: an append‑only log that preserves original evidence and timestamps.

This design lets humans and AI share the same source, searchable with tools like grep or Obsidian, and recoverable from Git if the database crashes.
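A minimal sketch of the two‑section page idea follows. The heading names (`## Compiled Truth`, `## Timeline`), the bullet format, and the sample page are assumptions made for illustration; the article does not show GBrain's actual file layout.

```python
from datetime import date

# Hypothetical page layout; GBrain's real headings may differ.
PAGE = """\
# Alice Example

## Compiled Truth
Alice is the CTO of Acme DB.

## Timeline
- 2026-01-10: Met Alice at a database meetup.
"""


def split_page(text: str) -> tuple[str, list[str]]:
    """Return (compiled_truth, timeline_entries) from a page in the assumed layout."""
    head, _, timeline = text.partition("## Timeline")
    _, _, truth = head.partition("## Compiled Truth")
    entries = [line[2:] for line in timeline.splitlines() if line.startswith("- ")]
    return truth.strip(), entries


def append_event(text: str, day: date, note: str) -> str:
    """Append one timeline entry; existing entries are never edited.

    Assumes the Timeline section is the last section in the file.
    """
    return text.rstrip("\n") + f"\n- {day.isoformat()}: {note}\n"


updated = append_event(PAGE, date(2026, 2, 3), "Alice joined the Acme DB board.")
truth, entries = split_page(updated)
print(truth)
print(entries)
```

Because the timeline only grows, a Git diff of the page shows exactly which evidence was added, while the summary at the top can be rewritten freely.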

Layer 2 – Retrieval Index (Hybrid Engine)

The core retrieval layer runs on PGLite (an embedded Postgres compiled to WebAssembly). The query pipeline is:

1. User query (optionally expanded to two alternatives by Claude Haiku).
2. Parallel search:
   - HNSW vector search (1536‑dim, cosine similarity).
   - PostgreSQL tsvector full‑text search with weighted fields (title > compiled > timeline).
3. RRF fusion (score = Σ 1/(60 + rank)).
4. Four‑stage deduplication (keep top 3 snippets per page, Jaccard > 0.85).
5. Backlink‑weighted re‑ranking (pages linked by many others receive a boost).
6. Return top‑K results.
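The RRF fusion step in the pipeline above can be sketched in a few lines. This is a generic Reciprocal Rank Fusion implementation using the constant 60 from the formula, not GBrain's own code; the page names are hypothetical.

```python
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists: score(doc) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the two parallel searches.
vector_hits = ["alice", "acme-db", "q3-meeting"]
keyword_hits = ["acme-db", "bob", "alice"]

fused = rrf_fuse([vector_hits, keyword_hits])
print([doc for doc, _ in fused])  # ['acme-db', 'alice', 'bob', 'q3-meeting']
```

RRF uses only rank positions, so the cosine similarities and full‑text scores never have to be normalized onto a common scale.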

Combining semantic vector search with exact keyword matching and backlink weighting explains the 31.4% gain over pure vector search.
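The deduplication stage is described only by two of its criteria (at most 3 snippets per page, Jaccard threshold 0.85). The sketch below implements just those two, assuming whitespace tokenization and word‑set Jaccard similarity; the remaining stages and GBrain's actual tokenizer are not documented in the article.

```python
def jaccard(a: str, b: str) -> float:
    """Word-set Jaccard similarity (assumed tokenization: lowercase, split on whitespace)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def dedupe(
    hits: list[tuple[str, str]], per_page: int = 3, threshold: float = 0.85
) -> list[tuple[str, str]]:
    """Filter (page, snippet) hits given in ranked order, best first."""
    kept: list[tuple[str, str]] = []
    count_by_page: dict[str, int] = {}
    for page, snippet in hits:
        if count_by_page.get(page, 0) >= per_page:
            continue  # this page already contributed its quota of snippets
        if any(jaccard(snippet, seen) > threshold for _, seen in kept):
            continue  # near-duplicate of a higher-ranked snippet
        kept.append((page, snippet))
        count_by_page[page] = count_by_page.get(page, 0) + 1
    return kept


hits = [
    ("alice", "Alice is the CTO of Acme DB"),
    ("acme-db", "alice is the cto of acme db"),   # same words, dropped
    ("alice", "Alice founded a meetup in 2024"),
    ("alice", "Alice advises two startups"),
    ("alice", "Alice spoke at PGConf"),           # fourth snippet from one page, dropped
]
for page, snippet in dedupe(hits):
    print(page, "|", snippet)
```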

Layer 3 – 34 Skills Workflow

GBrain follows the "Thin Harness, Fat Skills" philosophy: runtime code is minimal, while intelligence lives in 34 Markdown skill files. The skills are grouped into five categories:

Always‑on: signal‑detector, brain‑ops (continuous monitoring and scheduling).

Content Ingestion: ingest, meeting‑ingestion, media‑ingest (turn emails, meetings, tweets into structured brain pages).

Research Synthesis: research‑synthesizer (extract themes across multiple pages).

Brain Ops: enrich, maintain, citation‑fixer (enrich entities, deduplicate, fix citation chains).

Identity Setup: soul‑audit, setup, briefing.

Each skill file defines a complete workflow: when to trigger, what to read, what to write, where to write, and quality standards. Agents read these files to know how to act.
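To make the "fat skills" idea concrete, here is a rough sketch of how a harness might load a Markdown skill into named sections. The section headings and the sample skill text are invented for illustration; the article does not show the real skill file format.

```python
# Hypothetical skill file; real GBrain skills may be structured differently.
SKILL_MD = """\
# meeting-ingestion

## Trigger
A new meeting transcript arrives.

## Read
Existing pages for each attendee.

## Write
One meeting page plus a timeline entry per attendee.

## Quality
Every claim cites the transcript.
"""


def parse_sections(markdown: str) -> dict[str, str]:
    """Map each '## Heading' (lowercased) to the text that follows it."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip().lower()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


skill = parse_sections(SKILL_MD)
print(sorted(skill))
print(skill["trigger"])
```

The harness stays thin because it only routes text; what to read, what to write, and the quality bar all live in the Markdown that the agent reads.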

Zero‑LLM Knowledge Graph

When a page is added, GBrain extracts entities and relations using regular expressions and string matching—no LLM calls. Extracted relation types include attended, works_at, invested_in, founded, and advises. A tiered stubbing mechanism upgrades entities from a minimal stub (Tier 3) to fully enriched pages (Tier 1) based on frequency of appearance, web search, and social signals. This enables pure graph queries such as “who invested in the database company related to Alice?” without invoking an LLM.
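A small sketch of LLM‑free extraction and a graph query of the kind quoted above. The regular expressions, the capitalized‑name heuristic, and the sample sentence are assumptions; the article states only that regexes and string matching are used.

```python
import re

# Hypothetical heuristic: an entity is a run of capitalized words.
NAME = r"([A-Z]\w*(?: [A-Z]\w*)*)"
PATTERNS = {
    "works_at": re.compile(NAME + r" works at " + NAME),
    "invested_in": re.compile(NAME + r" invested in " + NAME),
    "founded": re.compile(NAME + r" founded " + NAME),
}


def extract(text: str) -> list[tuple[str, str, str]]:
    """Return (subject, relation, object) triples found by plain regex matching."""
    triples = []
    for relation, pattern in PATTERNS.items():
        for subject, obj in pattern.findall(text):
            triples.append((subject, relation, obj))
    return triples


def investors_in_employer_of(person: str, triples: list[tuple[str, str, str]]) -> list[str]:
    """Two-hop query: who invested in the company this person works at?"""
    employers = {o for s, r, o in triples if s == person and r == "works_at"}
    return sorted(s for s, r, o in triples if r == "invested_in" and o in employers)


text = "Alice works at Acme DB. Bob invested in Acme DB. Carol founded Widget Co."
triples = extract(text)
print(triples)
print(investors_in_employer_of("Alice", triples))  # ['Bob']
```

Pattern matching of this kind is cheap and deterministic but brittle; it will miss paraphrases that an LLM would catch, which is the trade the zero‑LLM design accepts.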

Dream Cycle – Night‑time Consolidation

The "Dream Cycle" mimics human sleep memory consolidation:

Daytime: Signal Detector captures emails, tweets, schedules, etc., in parallel without blocking.

When the agent responds, brain‑ops first checks the brain repo for prior knowledge.

After the response, new information is written to brain pages and entity relations are extracted.

Nighttime: Minions (GBrain’s task queue) run deterministic batch jobs—pulling posts, filling citations, deduplicating, rebuilding the index—at zero LLM token cost. Minions complete in ~753 ms, whereas a sub‑agent fan‑out approach often times out.

Garry Tan runs 21 cron jobs continuously; the brain updates while you sleep.
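The night‑time half of the cycle amounts to running plain functions from a queue. The toy sketch below illustrates that idea only; the job names, the in‑memory `Brain` type, and the runner are hypothetical stand‑ins and do not reflect the Minions API.

```python
import time
from collections import deque
from typing import Callable

Brain = dict[str, list[str]]  # page name -> timeline entries (toy stand-in)


def dedupe_timelines(brain: Brain) -> None:
    """Drop repeated timeline entries while preserving first-seen order."""
    for page, entries in brain.items():
        brain[page] = list(dict.fromkeys(entries))


def sort_timelines(brain: Brain) -> None:
    """Sort entries; ISO date prefixes make this chronological."""
    for entries in brain.values():
        entries.sort()


def run_nightly(brain: Brain, jobs: list[Callable[[Brain], None]]) -> list[tuple[str, float]]:
    """Run each deterministic job in order and record how long it took."""
    queue = deque(jobs)
    report = []
    while queue:
        job = queue.popleft()
        start = time.perf_counter()
        job(brain)  # no model calls: same input always gives the same output
        report.append((job.__name__, time.perf_counter() - start))
    return report


brain = {
    "alice": [
        "2026-02-03: joined board",
        "2026-01-10: met at meetup",
        "2026-02-03: joined board",
    ]
}
report = run_nightly(brain, [dedupe_timelines, sort_timelines])
print(brain["alice"])
print([name for name, _ in report])
```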

Python Integration via MCP

GBrain is written in TypeScript but can be accessed from any language through its MCP (Model Context Protocol) server. Install it and initialize a local brain:

git clone https://github.com/garrytan/gbrain.git
cd gbrain
bun install && bun link
gbrain init   # local brain, ready in ~2 s

Start the MCP server:

# HTTP mode for Python clients
gbrain serve --http --port 3131

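Python client example (requires the mcp package). The example uses the stdio transport and spawns gbrain serve itself, so it does not depend on the HTTP server started above: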

import asyncio
import json

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Spawn `gbrain serve` as a child process and talk to it over stdio.
SERVER_PARAMS = StdioServerParameters(command="gbrain", args=["serve"])


async def query_brain(query: str) -> list[dict]:
    """Search the brain and return the decoded result list."""
    async with stdio_client(SERVER_PARAMS) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "gbrain_search",
                arguments={"query": query, "top_k": 5},
            )
            # This code assumes the tool replies with JSON in its first text block.
            return json.loads(result.content[0].text)


async def remember_fact(title: str, content: str, tags: list[str] | None = None):
    """Write one page to the brain; tags default to an empty list."""
    async with stdio_client(SERVER_PARAMS) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool(
                "gbrain_write",
                arguments={"title": title, "content": content, "tags": tags or []},
            )
            return json.loads(result.content[0].text)


async def main():
    await remember_fact(
        title="LangGraph State Machine Tips",
        content="When nesting subgraphs, each checkpoint is saved separately. You must declare a shared_state field in the parent graph, otherwise child state is not persisted.",
        tags=["langgraph", "agent", "state-machine"],
    )
    results = await query_brain("LangGraph checkpoint missing in subgraph")
    for r in results:
        print(f"📄 {r['title']} (score: {r['score']:.3f})")
        print(f"   {r['snippet'][:200]}…\n")


if __name__ == "__main__":
    asyncio.run(main())

This pattern lets any Python agent write its memories into GBrain and query them back; the two functions can also be wrapped as LangGraph tools.

Claude Code Integration

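For Claude Code users, add an MCP server entry to Claude Code's MCP configuration. The source names the file claude_settings.json; Claude Code also reads project‑level servers from a .mcp.json file that uses the same mcpServers shape: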

{
  "mcpServers": {
    "gbrain": {"command": "gbrain", "args": ["serve"]}
  }
}

Then you can ask Claude Code, “Search my notes about React state management,” and it will invoke the GBrain MCP tool automatically.

Practical Pitfalls

Environment lock‑in: GBrain runs on Bun + TypeScript. Python‑first users must install Bun, manage Node versions (Node 22 LTS solved a WASM compile issue), and allocate ~15 minutes for setup.

Cold‑start latency: First query incurs a 3‑5 s delay while PGLite loads its shared buffer; subsequent queries are millisecond‑fast.

Skill file language: Skills are written in English with OpenClaw‑specific placeholders (e.g., {agent_name}). Adapting them to other agents requires manual edits.

Tech‑stack lock‑in: Official support exists only for OpenClaw and Hermes. Integrating LangGraph, CrewAI, or AutoGen needs custom MCP adapters.

Single‑operator design: Not suited for team‑wide knowledge bases.

Model dependence: Retrieval quality peaks with high‑end models (Claude Opus 4.6, GPT‑5.4); lower‑tier models show noticeable degradation.

Rapid iteration: Current version ~v0.30; API changes are frequent, with breaking changes considered normal.

Long‑Term Outlook

The author argues that agent memory will migrate from framework‑embedded modules to a standardized protocol layer (MCP). GBrain demonstrates the pattern but is too personalized to become a universal standard. Anticipated developments include:

Multi‑agent collaboration where several agents share a single brain.

Native Python SDKs driven by community demand.

Hosted cloud versions to avoid self‑hosting PGLite/Supabase.

Standardization of the memory protocol, with LangGraph and CrewAI likely to release official MCP memory servers within a year.

Architecture Diagram

The diagram (omitted here) shows the signal flow from external inputs, through the Skills layer (decision to remember), the Retrieval layer (search), and the Brain Repo (truth source), with the knowledge graph feeding back into retrieval for a positive feedback loop.


Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.


Written by

Data Party THU

Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.
