How MFS Unifies 20+ Data Sources with a Single Verb Set and How Open Tag Replicates Claude Tag

The article dissects Zilliztech's MFS, showing how a thin‑client, stateful‑server architecture uses a unified verb set to access over twenty heterogeneous data sources, and explains how the Open Tag demo re‑creates Claude Tag's brain‑memory‑tools workflow on top of MFS while highlighting its design trade‑offs and production‑readiness limits.

Shuge Unlimited

Three layers: Claude Tag, MFS, Open Tag

Claude Tag is the Anthropic‑hosted tag workflow. MFS (zilliztech/mfs) is the open‑source infrastructure that unifies more than twenty data sources into a searchable file tree and serves as the memory engine. Open Tag (examples/open-tag-skill/) is a reference implementation that reproduces the Claude Tag workflow on top of MFS.

MFS: a unified verb set for all data sources

A context harness for AI agents — and for building them: one unified workspace over your code, memory, skills, docs, messages, and every data source.

MFS follows a thin‑client + stateful‑server model. The client (Rust‑based mfs CLI, Python/TypeScript SDKs) is stateless and provides commands such as mfs-ingest (register and index a source) and mfs-find (search indexed content). All state lives in mfs-server, which hosts configuration, credentials, task queues, workers, and connectors.

Same verbs, any scheme

Resources are addressed with a URI of the form <scheme>://. The verb set is:

ls / tree / cat / head / tail / grep / search

Examples:

ls github://org/repo – list a GitHub repository

search slack://workspace – search Slack history
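To make the "same verbs, any scheme" idea concrete, here is a minimal Python sketch of how a verb plus URI could be split so that the scheme selects a connector. The function name parse_command and the returned dictionary shape are hypothetical and are not taken from the MFS codebase; only the verb list and the two example URIs come from the text above.

```python
from urllib.parse import urlsplit

# The verb set listed in the article.
VERBS = {"ls", "tree", "cat", "head", "tail", "grep", "search"}


def parse_command(verb: str, uri: str) -> dict:
    """Split a '<verb> <scheme>://<path>' request into its parts.

    Hypothetical helper: the real MFS client may route requests differently.
    """
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb}")
    parts = urlsplit(uri)
    if not parts.scheme:
        raise ValueError(f"missing scheme in {uri!r}")
    # The scheme picks the connector; the remainder is connector-specific.
    return {
        "verb": verb,
        "connector": parts.scheme,
        "path": (parts.netloc + parts.path).strip("/"),
    }


print(parse_command("ls", "github://org/repo"))
print(parse_command("search", "slack://workspace"))
```

The point of the sketch is that the verb is validated once, independently of the scheme, so adding a data source means adding a connector rather than a new command.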

Search vs Browse

Search: mfs search requires an index and performs hybrid dense‑vector + BM25 retrieval; mfs grep does exact or full‑text matching and needs no index.

Browse (no indexing): ls / tree / cat / head / tail let agents progressively locate bytes or records.

Each hit returns a locator, e.g. {"lines":[s,e]} for text or a primary‑key dictionary for structured records, enabling precise fetching.
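As an illustration of what "precise fetching" with a text locator could look like, the sketch below slices a document using a {"lines":[s,e]} locator. It assumes that s and e are 1-based and inclusive, which the article does not state; fetch_lines is a hypothetical helper, not an MFS API.

```python
def fetch_lines(text: str, locator: dict) -> str:
    """Return the lines named by a {"lines": [s, e]} locator.

    Assumption: s and e are 1-based and inclusive.
    """
    start, end = locator["lines"]
    lines = text.splitlines()
    # Convert the 1-based inclusive range to a Python slice.
    return "\n".join(lines[start - 1:end])


document = "alpha\nbeta\ngamma\ndelta"
print(fetch_lines(document, {"lines": [2, 3]}))
```

An agent holding such a locator can re-read exactly the matched span instead of the whole file.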

Backend switching by configuration

MFS separates configuration from code. Local data live under $MFS_HOME (default ~/.mfs). Production environments are enabled by swapping configuration values:

Vector store: Milvus Lite → self‑hosted Milvus or Zilliz Cloud

Metadata DB: SQLite → Postgres

Cache: local filesystem → S3

Embedding engine: local ONNX model → OpenAI / Gemini / Voyage / Ollama

Image description: disabled → OpenAI / Anthropic / Gemini

The design permits a zero‑key, zero‑GPU local start (≈600 MB model download to ~/.mfs/) and a seamless upgrade to a fully managed cloud backend without code changes.

Open Tag: mapping Claude Tag’s three elements

Open Tag reproduces Claude Tag’s Brain‑Memory‑Tools triad with open‑source components:

Brain: Anthropic‑hosted model service → CLI backend claude or codex

Memory: Anthropic‑hosted → MFS‑indexed, operator‑authorized context

Tools: Anthropic platform tools → MFS connectors exposing external search and workspace utilities

The Slack bridge (slack_socket_agent.py) performs five steps:

Receive an app_mention event via Socket Mode.

Fetch the thread with conversations.replies (limit 30).

Post a temporary “working” placeholder reply.

Invoke scripts/opentag_agent.py with the selected backend.

Replace the placeholder with the agent’s answer.

The bridge never answers directly; it only forwards the conversation to a fresh CLI agent process.
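Steps 2 to 5 above can be sketched as a single handler function. This is a minimal sketch, not the code in slack_socket_agent.py: it assumes client behaves like a slack_sdk WebClient (conversations_replies, chat_postMessage, chat_update), that run_agent is a hypothetical callable wrapping the agent invocation, and the placeholder text is made up. Registering the handler for app_mention events over Socket Mode (step 1) is left out.

```python
def handle_mention(event: dict, client, run_agent) -> None:
    """Forward one app_mention to an agent and post its answer.

    `client` is assumed to expose the slack_sdk WebClient methods used below;
    `run_agent` is a hypothetical callable taking the thread transcript.
    """
    channel = event["channel"]
    # A mention inside a thread carries thread_ts; otherwise it starts one.
    thread_ts = event.get("thread_ts", event["ts"])

    # Step 2: fetch the thread, capped at 30 messages.
    replies = client.conversations_replies(channel=channel, ts=thread_ts, limit=30)
    transcript = "\n".join(m.get("text", "") for m in replies["messages"])

    # Step 3: post a temporary placeholder in the thread.
    placeholder = client.chat_postMessage(
        channel=channel, thread_ts=thread_ts, text="Working on it..."
    )

    # Step 4: hand the conversation to the agent; the bridge never answers itself.
    answer = run_agent(transcript)

    # Step 5: replace the placeholder with the agent's answer.
    client.chat_update(channel=channel, ts=placeholder["ts"], text=answer)
```

Passing the client and the agent runner as arguments keeps the bridge logic testable without a live Slack workspace.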

Fresh agent per mention

Each @OpenTag mention spawns a new opentag_agent.py process. Backend invocation examples:

claude -p --dangerously-skip-permissions --add-dir <workdir> --add-dir <skill-dir> --add-dir <memory-root>
codex exec --dangerously-bypass-approvals-and-sandbox … --output-last-message <tmp>

Failed invocations are retried; the attempt count defaults to three and is controlled by OPENTAG_BACKEND_ATTEMPTS.

This stateless approach avoids cross‑conversation state, simplifying debugging at the cost of re‑initialising the model for each request.
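The fresh-process-per-mention pattern with a bounded number of attempts might look like the following sketch. It is not the code in opentag_agent.py: run_fresh_agent is a hypothetical name, passing the prompt on standard input is an assumption, and OPENTAG_BACKEND_ATTEMPTS is read here as the total number of attempts.

```python
import os
import subprocess


def run_fresh_agent(cmd: list[str], prompt: str) -> str:
    """Run a backend CLI in a new process, retrying on a non-zero exit code.

    Assumptions: the prompt is fed via stdin and the answer is read from
    stdout; the real script may pass them differently.
    """
    attempts = int(os.environ.get("OPENTAG_BACKEND_ATTEMPTS", "3"))
    last_error = ""
    for _ in range(attempts):
        # Each attempt is a brand-new process, so no state carries over.
        result = subprocess.run(cmd, input=prompt, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout.strip()
        last_error = result.stderr.strip()
    raise RuntimeError(f"backend failed after {attempts} attempts: {last_error}")
```

Because nothing survives between calls, a failed mention can be reproduced by rerunning the same command with the same input.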

Memory boundaries

Agent visibility is limited by the MFS_ALLOWED_SCOPES environment variable. Helper scripts (mfs_search.py, mfs_cat.py) call is_scope_allowed() before invoking the /v1/search endpoint; a disallowed scope makes the script exit with code 2.
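A scope gate of this kind could be sketched as follows. This is an illustrative stand-in, not the project's is_scope_allowed(): the two-argument signature, the comma-separated format of MFS_ALLOWED_SCOPES, and the prefix-matching rule are all assumptions. Only the variable name and exit code 2 come from the article.

```python
import os
import sys


def is_scope_allowed(scope: str, allowed: list[str]) -> bool:
    """True if `scope` equals an allowed entry or sits beneath it.

    Assumed rule: match on whole path segments, so 'github://org/repo'
    does not accidentally allow 'github://org/repo-private'.
    """
    for entry in allowed:
        prefix = entry.rstrip("/")
        if scope == prefix or scope.startswith(prefix + "/"):
            return True
    return False


def require_scope(scope: str) -> None:
    """Exit with code 2 when the scope is not allowed."""
    # Assumption: the variable holds a comma-separated list of URIs.
    raw = os.environ.get("MFS_ALLOWED_SCOPES", "")
    allowed = [s.strip() for s in raw.split(",") if s.strip()]
    if not is_scope_allowed(scope, allowed):
        print(f"scope not allowed: {scope}", file=sys.stderr)
        sys.exit(2)
```

With an empty or unset variable this sketch denies everything, which is the safer default for a gate that sits in front of search.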

Security posture

This is a demo/reference implementation, not a production security boundary. It lacks a hardened sandbox, multi‑user policy engine, audit system, or approval flow.

MFS provides more than twenty connectors (Postgres, MongoDB, BigQuery, S3, GitHub, Jira, Slack, Discord, Gmail, Notion, etc.), all self‑hosted, which keeps data and credentials on‑premises.

Credential design

Connector TOML files store only references, e.g.:

# Only references, real values never stored in config
token = "env:SLACK_BOT_TOKEN"
password = "file:/run/secrets/db_password"

Actual secrets reside in environment variables or mounted files, allowing configuration files to be version‑controlled safely.
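Resolving such references at load time could work roughly like the sketch below. The env: and file: prefixes are taken from the example above; resolve_secret is a hypothetical helper, and stripping trailing whitespace from file contents is an assumption.

```python
import os
from pathlib import Path


def resolve_secret(reference: str) -> str:
    """Turn an 'env:NAME' or 'file:/path' reference into the real value."""
    kind, sep, target = reference.partition(":")
    if not sep:
        raise ValueError(f"not a reference: {reference!r}")
    if kind == "env":
        # Raises KeyError if the variable is missing, which surfaces
        # misconfiguration early instead of using an empty secret.
        return os.environ[target]
    if kind == "file":
        # Assumption: mounted secret files may end with a newline.
        return Path(target).read_text().strip()
    raise ValueError(f"unsupported reference: {reference!r}")
```

The config file then only ever contains strings like env:SLACK_BOT_TOKEN, so committing it does not leak the secret itself.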

Use as a foundation

MFS can be called directly from application code via its Python/TypeScript SDKs, or driven through the CLI to build custom Skills or MCP servers. It handles indexing pipelines, chunking, embedding, the vector store, cache, and connector orchestration, sparing upper‑layer applications from implementing these pieces.

Conclusion

MFS unifies over twenty data sources into a searchable file tree, exposing a single verb set for agents. A zero‑key, zero‑GPU local start upgrades to Zilliz Cloud by changing configuration only. Open Tag builds on MFS to replicate Claude Tag’s Brain‑Memory‑Tools workflow, using a thin Slack bridge and fresh agents per mention, while explicitly stating it is a demo rather than a hardened production security boundary.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
