How MFS Unifies 20+ Data Sources with a Single Verb Set and How Open Tag Replicates Claude Tag
The article dissects Zilliztech's MFS, showing how a thin‑client, stateful‑server architecture uses a unified verb set to access over twenty heterogeneous data sources, and explains how the Open Tag demo re‑creates Claude Tag's brain‑memory‑tools workflow on top of MFS while highlighting its design trade‑offs and production‑readiness limits.
Three layers: Claude Tag, MFS, Open Tag
Claude Tag is the Anthropic‑hosted tag workflow. MFS (zilliztech/mfs) is the open‑source infrastructure that unifies more than twenty data sources into a searchable file tree and serves as the memory engine. Open Tag (examples/open-tag-skill/) is a reference implementation that reproduces the Claude Tag workflow on top of MFS.
MFS: a unified verb set for all data sources
A context harness for AI agents — and for building them: one unified workspace over your code, memory, skills, docs, messages, and every data source.
MFS follows a thin‑client + stateful‑server model. The client (Rust‑based mfs CLI, Python/TypeScript SDKs) is stateless and provides commands such as mfs-ingest (register and index a source) and mfs-find (search indexed content). All state lives in mfs-server, which hosts configuration, credentials, task queues, workers, and connectors.
Same verbs, any scheme
Resources are addressed with a URI scheme <scheme>://. The verb set is: ls / tree / cat / head / tail / grep / search
Examples:
ls github://org/repo – list a GitHub repository
search slack://workspace – search Slack history
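To make the addressing model concrete, here is a minimal sketch of how a verb plus a <scheme>:// address could be split into the pieces a connector needs. The function name parse_command and the returned dictionary keys are hypothetical, chosen for illustration; they are not part of the MFS CLI or SDK.

```python
from urllib.parse import urlsplit

# The verb set listed in the article.
VERBS = {"ls", "tree", "cat", "head", "tail", "grep", "search"}


def parse_command(verb: str, uri: str) -> dict:
    """Split a verb and a <scheme>:// address into routing fields.

    Hypothetical helper for illustration only; not an MFS API.
    """
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb}")
    parts = urlsplit(uri)
    if not parts.scheme:
        raise ValueError(f"missing <scheme>:// in {uri!r}")
    return {
        "verb": verb,
        "scheme": parts.scheme,          # would select the connector
        "authority": parts.netloc,       # e.g. "org" or "workspace"
        "path": parts.path.lstrip("/"),  # remainder inside the source
    }


if __name__ == "__main__":
    print(parse_command("ls", "github://org/repo"))
    print(parse_command("search", "slack://workspace"))
```

Because the verb is validated separately from the scheme, the same seven verbs apply to every source, and only the scheme decides which connector handles the request.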
Search vs Browse
Search: mfs search requires an index and performs hybrid dense‑vector + BM25 retrieval; mfs grep does exact or full‑text matching and needs no index.
Browse (no indexing): ls / tree / cat / head / tail lets agents progressively locate bytes or records.
Each hit returns a locator, e.g. {"lines":[s,e]} for text or a primary‑key dictionary for structured records, enabling precise fetching.
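As an illustration of precise fetching, the sketch below applies a text locator to a document body. It assumes that {"lines":[s,e]} is 1‑based and inclusive; the article does not state the indexing convention, so treat that as a guess. The function fetch_by_locator is a hypothetical name.

```python
def fetch_by_locator(text: str, locator: dict) -> str:
    """Return the lines that a text locator points at.

    Assumption: {"lines": [s, e]} is 1-based and inclusive.
    Hypothetical helper, not an MFS API.
    """
    start, end = locator["lines"]
    if start < 1 or end < start:
        raise ValueError(f"invalid line range: {locator['lines']}")
    lines = text.splitlines()
    # Convert the 1-based inclusive range to a Python slice.
    return "\n".join(lines[start - 1:end])


if __name__ == "__main__":
    body = "alpha\nbeta\ngamma\ndelta"
    print(fetch_by_locator(body, {"lines": [2, 3]}))
```

An agent would typically search first, then fetch only the located range instead of the whole file, which keeps the context window small.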
Backend switching by configuration
MFS separates configuration from code. Local data live under $MFS_HOME (default ~/.mfs). Production environments are enabled by swapping configuration values:
Vector store: Milvus Lite → self‑hosted Milvus or Zilliz Cloud
Metadata DB: SQLite → Postgres
Cache: local filesystem → S3
Embedding engine: local ONNX model → OpenAI / Gemini / Voyage / Ollama
Image description: disabled → OpenAI / Anthropic / Gemini
The design permits a zero‑key, zero‑GPU local start (≈600 MB model download to ~/.mfs/) and a seamless upgrade to a fully managed cloud backend without code changes.
Open Tag: mapping Claude Tag’s three elements
Open Tag reproduces Claude Tag’s Brain‑Memory‑Tools triad with open‑source components:
Brain: Anthropic‑hosted model service → CLI backend claude or codex
Memory: Anthropic‑hosted → MFS‑indexed, operator‑authorized context
Tools: Anthropic platform tools → MFS connectors exposing external search and workspace utilities
The Slack bridge (slack_socket_agent.py) performs five steps:
Receive an app_mention event via Socket Mode.
Fetch the thread with conversations.replies (limit 30).
Post a temporary “working” placeholder reply.
Invoke scripts/opentag_agent.py with the selected backend.
Replace the placeholder with the agent’s answer.
The bridge never answers directly; it only forwards the conversation to a fresh CLI agent process.
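Steps 2 through 5 above can be sketched as a single handler. This is not the code from slack_socket_agent.py; it is a minimal illustration that assumes a client object exposing the Slack Web API methods conversations_replies, chat_postMessage, and chat_update (as slack_sdk's WebClient does), and a hypothetical run_agent callable standing in for the call to scripts/opentag_agent.py. Step 1, receiving the app_mention event over Socket Mode, is left to whatever listener invokes the handler.

```python
from typing import Callable


def handle_mention(event: dict, client, run_agent: Callable[[str], str]) -> None:
    """Forward one app_mention to an agent and post its answer.

    `client` is assumed to expose the Slack Web API methods used below.
    `run_agent` is a hypothetical stand-in for launching the CLI agent.
    """
    channel = event["channel"]
    # A mention outside a thread starts a new thread rooted at its own ts.
    thread_ts = event.get("thread_ts", event["ts"])

    # Step 2: fetch the thread (the article states a limit of 30).
    replies = client.conversations_replies(
        channel=channel, ts=thread_ts, limit=30)
    transcript = "\n".join(m.get("text", "") for m in replies["messages"])

    # Step 3: post a temporary placeholder reply.
    placeholder = client.chat_postMessage(
        channel=channel, thread_ts=thread_ts, text="Working on it...")

    # Step 4: hand the conversation to a fresh agent.
    answer = run_agent(transcript)

    # Step 5: replace the placeholder with the agent's answer.
    client.chat_update(channel=channel, ts=placeholder["ts"], text=answer)
```

Passing the client and the agent runner as arguments keeps the handler free of Slack credentials and model logic, which mirrors the bridge's role as a pure forwarder.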
Fresh agent per mention
Each @OpenTag mention spawns a new opentag_agent.py process. Backend invocation examples:
claude -p --dangerously-skip-permissions --add-dir <workdir> --add-dir <skill-dir> --add-dir <memory-root>
codex exec --dangerously-bypass-approvals-and-sandbox … --output-last-message <tmp>
(default three retry attempts via OPENTAG_BACKEND_ATTEMPTS)
This stateless approach avoids cross‑conversation state, simplifying debugging at the cost of re‑initialising the model for each request.
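The process‑per‑mention pattern with retries might look like the sketch below. It is not the actual opentag_agent.py: the function run_backend is hypothetical, passing the prompt on standard input is an assumption, and the sketch reads OPENTAG_BACKEND_ATTEMPTS as the total number of attempts with a default of three.

```python
import os
import subprocess
import sys
from typing import List, Optional


def run_backend(cmd: List[str], prompt: str,
                attempts: Optional[int] = None) -> str:
    """Run a backend CLI in a fresh process, retrying on failure.

    Hypothetical helper. Assumes the prompt is passed on stdin and that
    OPENTAG_BACKEND_ATTEMPTS holds the total attempt count.
    """
    if attempts is None:
        attempts = int(os.environ.get("OPENTAG_BACKEND_ATTEMPTS", "3"))
    attempts = max(1, attempts)
    last_error = ""
    for _ in range(attempts):
        # Each attempt is a brand-new process: no state carries over.
        result = subprocess.run(
            cmd, input=prompt, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout
        last_error = result.stderr.strip()
    raise RuntimeError(
        f"backend failed after {attempts} attempts: {last_error}")


if __name__ == "__main__":
    # Stand-in backend: a Python one-liner that echoes stdin in upper case.
    echo = [sys.executable, "-c",
            "import sys; print(sys.stdin.read().upper())"]
    print(run_backend(echo, "hello"))
```

Since nothing survives between invocations, a failed mention can be reproduced by replaying the same command and prompt.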
Memory boundaries
Agent visibility is limited by the MFS_ALLOWED_SCOPES environment variable. Helper scripts (mfs_search.py, mfs_cat.py) call is_scope_allowed() before invoking the /v1/search endpoint; disallowed scopes cause exit code 2.
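A scope check of this kind could be sketched as follows. The article names is_scope_allowed() and the exit code but not the details, so the signature, the comma‑separated format of MFS_ALLOWED_SCOPES, the prefix‑matching rule, and the deny‑by‑default behaviour for an empty list are all assumptions. The helpers allowed_scopes and guard are hypothetical.

```python
import os
import sys
from typing import List


def allowed_scopes() -> List[str]:
    """Read the allowlist. Assumes a comma-separated list of URI prefixes."""
    raw = os.environ.get("MFS_ALLOWED_SCOPES", "")
    return [s.strip() for s in raw.split(",") if s.strip()]


def is_scope_allowed(scope: str, allowed: List[str]) -> bool:
    """Return True if `scope` equals or sits below an allowed prefix.

    Assumed matching rule; an empty allowlist denies everything.
    """
    for prefix in allowed:
        base = prefix.rstrip("/")
        # Require a path boundary so "repo" does not match "repo2".
        if scope == base or scope.startswith(base + "/"):
            return True
    return False


def guard(scope: str) -> None:
    """Exit with code 2 before any request is sent for a disallowed scope."""
    if not is_scope_allowed(scope, allowed_scopes()):
        print(f"scope not allowed: {scope}", file=sys.stderr)
        sys.exit(2)
```

Checking in the helper script keeps the agent from issuing the request at all, although, as the next section notes, an environment variable read by a cooperative script is not a hardened boundary.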
Security posture
This is a demo/reference implementation, not a production security boundary. It lacks a hardened sandbox, multi‑user policy engine, audit system, or approval flow.
MFS provides more than twenty connectors (Postgres, MongoDB, BigQuery, S3, GitHub, Jira, Slack, Discord, Gmail, Notion, etc.), all self‑hosted, so data and credentials stay on‑premise.
Credential design
Connector TOML files store only references, e.g.:
# Only references, real values never stored in config
token = "env:SLACK_BOT_TOKEN"
password = "file:/run/secrets/db_password"
Actual secrets reside in environment variables or mounted files, allowing configuration files to be version‑controlled safely.
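Resolving such references at load time could look like the sketch below. The env: and file: prefixes come from the example above; the function resolve_secret is hypothetical, and stripping trailing whitespace from file contents is an assumption rather than documented MFS behaviour.

```python
import os
from pathlib import Path


def resolve_secret(ref: str) -> str:
    """Turn an "env:NAME" or "file:/path" reference into the real value.

    Hypothetical helper, not the MFS implementation.
    """
    kind, sep, target = ref.partition(":")
    if sep and kind == "env":
        # Raises KeyError if the variable is not set.
        return os.environ[target]
    if sep and kind == "file":
        # Assumption: trailing newlines in secret files are not significant.
        return Path(target).read_text().strip()
    raise ValueError(f"unsupported secret reference: {ref!r}")


if __name__ == "__main__":
    os.environ["SLACK_BOT_TOKEN"] = "xoxb-example"
    print(resolve_secret("env:SLACK_BOT_TOKEN"))
```

Rejecting anything without a known prefix means a raw secret pasted into the config fails loudly instead of being used silently.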
Use as a foundation
MFS can be called directly from application code via its Python/TypeScript SDKs, or driven through the CLI to build custom Skills or MCP servers. It handles indexing pipelines, chunking, embedding, the vector store, the cache, and connector orchestration, so upper‑layer applications do not have to implement these pieces.
Conclusion
MFS unifies over twenty data sources into a searchable file tree, exposing a single verb set for agents. A zero‑key, zero‑GPU local start upgrades to Zilliz Cloud by changing configuration only. Open Tag builds on MFS to replicate Claude Tag’s Brain‑Memory‑Tools workflow, using a thin Slack bridge and fresh agents per mention, while explicitly stating it is a demo rather than a hardened production security boundary.
Shuge Unlimited
Formerly "Ops with Skill", now officially upgraded. Fully dedicated to AI, we share both the why (fundamental insights) and the how (practical implementation). From technical operations to breakthrough thinking, we help you understand AI's transformation and master the core abilities needed to shape the future. ShugeX: boundless exploration, skillful execution.
