When AI Knows Too Much: How MemPrivacy Secures Agent Memory

MemPrivacy introduces a reversible, fine‑grained privacy layer for edge‑cloud agents, outperforming OpenAI's privacy filter by more than 50 F1 points while keeping system utility loss under 2 %, enabling agents to stay useful without exposing raw sensitive data.

Machine Heart

In the emerging era of AI memory, agents increasingly act like personal assistants: they remember habits, schedules, and health status, and gradually build a personal profile. This raises a critical question: if these memories are stored in the cloud, is privacy still safe?

OpenAI Privacy‑Filter vs. MemPrivacy

On April 22, OpenAI open‑sourced a lightweight privacy‑filter model (1.5 B parameters, ~50 M activated parameters, 128k context) that scans text, identifies PII, and replaces it with semantic tags such as [PRIVATE_PERSON]. While this approach preserves more semantics than simple masking, it offers only eight coarse tags, which is insufficient for long‑term, personalized agents that must understand nuanced contexts.

MemTensor, in collaboration with HONOR and Tongji University, released MemPrivacy, a privacy‑preserving framework designed specifically for edge‑cloud agents. Unlike generic PII masking, MemPrivacy adopts local reversible pseudo‑anonymization and introduces fine‑grained typed placeholders (e.g., <Health_Info_1>) that retain semantic structure while keeping raw data on the device.

Three‑Step Workflow

Edge‑side upstream desensitization: The user’s utterance is processed locally by a lightweight MemPrivacy model, which detects privacy fragments and replaces them with type‑specific placeholders. The mapping between each placeholder and its original value is stored only in a local database.

Cloud‑side secure processing: The cloud model receives the placeholder‑rich text (e.g., “My blood pressure today is <Health_Info_1>”). It can still reason about the fact that this is a health metric, generate advice, and store the information in the agent’s memory without ever seeing the raw number.

Edge‑side downstream recovery: When the cloud replies (e.g., “Your blood pressure <Health_Info_1> is high”), the edge device restores the original value from the local store and presents it to the user.

This design ensures that the cloud sees the structure but never the plaintext, achieving the goal “make the agent usable, but invisible.”
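The three steps above can be sketched in a few lines. The class and method names below are hypothetical (the paper does not publish this API); the key property is that the placeholder‑to‑value map never leaves the device:

```python
class EdgePrivacyStore:
    """Toy edge-side store: maps typed placeholders to raw values, locally only."""

    def __init__(self):
        self._map = {}       # placeholder -> raw value (never uploaded)
        self._counters = {}  # per-type counters for numbering <Type_N>

    def desensitize(self, text, detections):
        """Replace each (raw_value, type) detection with a typed placeholder."""
        for raw, ptype in detections:
            n = self._counters.get(ptype, 0) + 1
            self._counters[ptype] = n
            placeholder = f"<{ptype}_{n}>"
            self._map[placeholder] = raw
            text = text.replace(raw, placeholder)
        return text

    def recover(self, text):
        """Restore raw values in the cloud's reply from the local map."""
        for placeholder, raw in self._map.items():
            text = text.replace(placeholder, raw)
        return text

store = EdgePrivacyStore()
# In practice, detections come from the local MemPrivacy model; hard-coded here.
upstream = store.desensitize("My blood pressure today is 160/110",
                             [("160/110", "Health_Info")])
# upstream == "My blood pressure today is <Health_Info_1>"  -> sent to the cloud
reply = "Your blood pressure <Health_Info_1> is high"      # cloud's answer
print(store.recover(reply))  # -> "Your blood pressure 160/110 is high"
```

The cloud only ever handles the placeholder text, yet the typed name still tells it what kind of value it is reasoning about.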

Fine‑Grained Privacy Taxonomy

MemPrivacy defines a four‑level privacy tree (PL1‑PL4) based on recognizability, potential harm, and usability:

PL4 – Critical Core: Plaintext passwords, API keys, session tokens, etc. Zero tolerance: never sent to the cloud.

PL3 – High‑Risk Sensitive: Identity documents, detailed medical diagnoses, precise location traces. Strong protection needed.

PL2 – Identity Anchor: Real names, addresses, phone numbers, private emails. Can be used for personalization but must be guarded.

PL1 – Basic Profile: Habits, interests, non‑diagnostic emotions. Safe for long‑term memory.

The taxonomy lets users adjust protection thresholds, turning privacy filtering from a blunt hammer into a nuanced tool.

Benchmark Results

To evaluate MemPrivacy, the team built two benchmarks:

MemPrivacy‑Bench: 200 user dialogues, >155 k privacy items, bilingual (Chinese/English) detection.

PersonaMem‑v2: An out‑of‑distribution long‑text dialogue set for cross‑domain testing.

On MemPrivacy‑Bench, OpenAI’s privacy filter achieved a composite F1 of 35.50 %, while MemPrivacy‑4B‑RL reached 85.97 % – a 50.47‑point absolute gain. On PersonaMem‑v2, MemPrivacy still outperformed OpenAI’s filter by roughly 9 points.

In addition to accuracy, system utility was measured. Traditional irreversible masking caused accuracy drops of 26.67 %–41.87 % across three memory systems, effectively crippling the agent. MemPrivacy, even with all protection levels (PL4+PL3+PL2) enabled, limited utility loss to 0.71 %–1.60 %; protecting only the highest‑risk PL4 resulted in less than 0.89 % loss.

Training Pipeline

MemPrivacy models are built on the Qwen‑3 series (0.6 B, 1.7 B, 4 B). Training proceeds in two stages:

SFT (Supervised Fine‑Tuning): 26 K high‑quality multi‑turn dialogues teach the model basic privacy detection, type classification, and placeholder substitution.

GRPO (Group Relative Policy Optimization): A structured reward based on extraction F1 directly optimizes the model for ambiguous, context‑dependent privacy cues (e.g., “I’m stressed” vs. “My blood pressure is 160/110”).

This RL stage improves recall and precision on hard cases where privacy boundaries are fuzzy.
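A plausible shape for such a reward is F1 computed over predicted versus gold (value, type) pairs, used as the scalar signal for policy optimization. This is a sketch of that idea; the paper's exact reward structure may differ:

```python
def extraction_f1(predicted: set, gold: set) -> float:
    """F1 over (value, type) pairs; usable as a scalar RL reward."""
    if not predicted and not gold:
        return 1.0  # nothing to find, nothing claimed: perfect score
    tp = len(predicted & gold)  # exact matches on both value and type
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

gold = {("160/110", "Health_Info"), ("Alice", "Person_Name")}
pred = {("160/110", "Health_Info")}  # model missed the name
print(round(extraction_f1(pred, gold), 3))  # 0.667
```

Because the reward depends on both precision and recall, the policy is penalized for over‑redacting benign text as well as for leaking real privacy items.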

Comparison with Large General Models

Even against massive models such as GPT‑5.2, Gemini‑3.1‑Pro, and DeepSeek‑V3.2‑Think, the 4 B MemPrivacy model (and the 0.6 B variant) consistently achieved higher F1 scores on both benchmarks, demonstrating that privacy extraction is not solved by sheer parameter scale but by task‑specific design.

Implications and Availability

MemPrivacy provides a practical, high‑precision solution for edge‑cloud agents that need long‑term, personalized memory without compromising user privacy. The model weights, benchmarks, and code are fully open‑source (paper: https://arxiv.org/pdf/2605.09530, code: https://github.com/MemTensor/MemPrivacy, model hub: https://huggingface.co/collections/IAAR-Shanghai/memprivacy). This makes it immediately usable for AI builders and enterprises facing strict data‑protection regulations.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
