Operations 14 min read

Seamless Cross‑Domain Connections in Hermes Agent via Gateway Boundary Separation

Hermes introduces a layered Gateway architecture that cleanly separates entry points—CLI, messaging platforms, and HTTP—from the core AIAgent, enabling stable reuse across multiple channels while handling streaming adaptation, session routing, approvals, execution isolation, and deployment packaging in a unified control plane.

AI Step-by-Step

Problem: fragmented agents

Agents often grow separate code paths for the CLI, Discord and Telegram. Each path maintains its own polling, permissions, sessions and message formats, so behavior drifts between channels and effort is duplicated.

Solution: Hermes Gateway

Hermes introduces a dedicated Gateway that isolates entry‑point handling from the core AIAgent, so that a single agent is reused across the CLI, chat platforms and HTTP/Webhook entry points.

Six‑layer architecture

Entry layer: CLI/TUI, Messaging Gateway (Telegram, Discord, WeCom, Weixin, Slack, QQ), HTTP entry (API Server, Webhook).

Streaming adaptation layer: wraps a single answer into stdout token stream, OpenAI‑compatible SSE, or progressive message‑edit updates with configurable edit interval and buffer threshold.

Unified control plane: platform adapters, deterministic session routing, approval handling, interruption, cron scheduling, result delivery.

Unified core: AIAgent reuses the same prompt, memory, skills, tools and approvals.

State & identity layer: SQLite/JSONL session stores, MEMORY.md, USER.md, skills and profiles.

Execution & deployment boundary: terminal backend (local, docker, singularity, modal, daytona) defines side‑effect isolation; deployment packaging (foreground, systemd, launchd, Docker, profiles, webhook mode) defines service shape.

Entry points are shells

CLI/TUI interacts with the local directory and terminal. Messaging Gateway handles bot tokens, mention rules, thread isolation and authentication for each chat platform. HTTP entry exposes the OpenAI‑compatible endpoints /v1/chat/completions and /v1/responses and performs HMAC verification. All three forward messages to the same AIAgent without influencing its business logic.
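The HMAC check mentioned above can be sketched as follows. This is a minimal illustration rather than Hermes's actual code: the source does not say which hash algorithm or header carries the signature, so SHA‑256 with a hex digest is an assumption, and the function names are hypothetical.

```python
import hashlib
import hmac


def sign_body(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest a sender would attach to a request.

    SHA-256 and hex encoding are assumptions made for this sketch.
    """
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_hex: str) -> bool:
    """Recompute the digest over the raw body and compare in constant time."""
    expected = sign_body(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), received_hex.encode("utf-8"))
```

The comparison runs over the raw request bytes before any JSON parsing, so a body that was modified in transit fails the check even when it still parses.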

Streaming adaptation determines answer delivery

CLI streams tokens directly to stdout for minimal latency. With stream: true, the API Server rewrites the token stream into strict SSE blocks and embeds tool‑progress events. Chat platforms (Discord, Telegram, Slack) cannot accept one message per token, so Hermes buffers tokens and sends batched message edits, governed by the edit transport setting (transport: edit), the edit interval and the buffer threshold. Platforms without edit support receive a single final result.
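A rough sketch of the buffered edit strategy, under stated assumptions: the class name, the send_edit callback and the exact flush rule are hypothetical. Here an edit is sent only when at least buffer_threshold new characters are waiting and at least edit_interval seconds have passed since the previous edit; Hermes may combine the two settings differently.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class EditBatcher:
    """Collects streamed tokens and emits whole-message edits in batches."""

    send_edit: Callable[[str], None]  # hypothetical platform call: replace message text
    edit_interval: float = 1.0        # minimum seconds between edits
    buffer_threshold: int = 40        # minimum new characters before an edit
    _text: str = field(default="", init=False)
    _pending: int = field(default=0, init=False)
    _last_edit: float = field(default=float("-inf"), init=False)

    def feed(self, token: str, now: float) -> None:
        """Add one token; send an edit if both limits allow it."""
        self._text += token
        self._pending += len(token)
        enough_text = self._pending >= self.buffer_threshold
        enough_time = now - self._last_edit >= self.edit_interval
        if enough_text and enough_time:
            self._flush(now)

    def finish(self, now: float) -> None:
        """Send whatever is still buffered once the answer is complete."""
        if self._pending:
            self._flush(now)

    def _flush(self, now: float) -> None:
        self.send_edit(self._text)
        self._pending = 0
        self._last_edit = now
```

Passing the clock value in as now keeps the sketch deterministic; a real adapter would more likely read time.monotonic() itself.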

Unified control plane

Gateway runs as a persistent background process that connects configured platforms, maintains per‑chat session stores, runs scheduled tasks and routes results back to the originating channel. Its responsibilities include:

Platform integration (Telegram, Discord, WeCom, Weixin, Webhook, etc.)

Deterministic session key generation: private chat → chat_id; group chat → chat_id+user_id; thread → additionally thread_id

Unified execution: all messages enter a single Agent loop, sharing memory, skills, toolsets and approval mechanisms

Result delivery: text, audio, files, background‑task results and cron messages are sent to the correct target
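The session key rule in the list above could look roughly like this. The function name, the ":" separator and the platform prefix are assumptions added for illustration; the source only specifies which identifiers take part in the key.

```python
from typing import Optional


def session_key(
    platform: str,
    chat_id: str,
    user_id: Optional[str] = None,
    thread_id: Optional[str] = None,
    is_group: bool = False,
) -> str:
    """Build a deterministic key: the same inputs always map to the same session.

    private chat -> chat_id
    group chat   -> chat_id + user_id
    thread       -> additionally thread_id
    The platform prefix is an assumption to keep keys from colliding across platforms.
    """
    parts = [platform, chat_id]
    if is_group:
        if user_id is None:
            raise ValueError("group chats need a user_id to separate participants")
        parts.append(user_id)
    if thread_id is not None:
        parts.append(thread_id)
    return ":".join(parts)
```

Because the key is a pure function of the message metadata, no lookup table is needed to route a follow-up message back to its session.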

Approval flow example

Bot: “About to execute rm -rf /workspace/tmp, requires approval.”

The bot prompts for yes / no or /approve / /deny, and the original session is paused while the request is pending. When the user replies yes, the system records the approval and resumes execution.
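One way to picture this flow is a small gate keyed by session. This is an illustrative sketch only: ApprovalGate, PendingApproval and the decision strings are hypothetical names, and the way Hermes actually stores pending approvals is not described in the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

APPROVE_WORDS = {"yes", "/approve"}
DENY_WORDS = {"no", "/deny"}


@dataclass
class PendingApproval:
    command: str


class ApprovalGate:
    """Holds one pending approval per session until the user answers."""

    def __init__(self) -> None:
        self._pending: Dict[str, PendingApproval] = {}
        self.audit_log: List[Tuple[str, str, str]] = []

    def request(self, session_key: str, command: str) -> str:
        """Pause the session on this command and return the prompt to show."""
        self._pending[session_key] = PendingApproval(command)
        return f"About to execute {command}, requires approval."

    def handle_reply(self, session_key: str, text: str) -> Optional[str]:
        """Return 'approved' or 'denied', or None if nothing was decided."""
        pending = self._pending.get(session_key)
        if pending is None:
            return None  # no approval outstanding for this session
        answer = text.strip().lower()
        if answer in APPROVE_WORDS:
            decision = "approved"
        elif answer in DENY_WORDS:
            decision = "denied"
        else:
            return None  # unrelated message; keep waiting
        del self._pending[session_key]
        self.audit_log.append((session_key, pending.command, decision))
        return decision
```

The caller would resume the paused tool call on "approved" and cancel it on "denied"; any other reply leaves the request pending.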

Unified core (AIAgent)

The core resides in run_agent.py. CLI invokes AIAgent directly; Gateway uses GatewayRunner to forward platform messages; API Server forwards OpenAI‑compatible requests to the same toolchain. Changes to prompts, memory injection, skills or approvals are made once in the core and affect all entry points.
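The "thin shell, shared core" idea can be illustrated with a toy example. CoreAgent and the three entry functions are stand-ins invented for this sketch; they are not the real AIAgent or GatewayRunner interfaces, which the source does not show.

```python
from typing import Any, Dict


class CoreAgent:
    """Stand-in for the shared core; not the real AIAgent interface."""

    def __init__(self, system_prompt: str) -> None:
        self.system_prompt = system_prompt

    def run(self, session_key: str, text: str) -> str:
        # A real core would call a model and tools; this one just echoes.
        return f"[{self.system_prompt}] {text}"


def cli_entry(agent: CoreAgent, line: str) -> str:
    """CLI shell: passes the typed line straight through."""
    return agent.run("cli:local", line)


def chat_entry(agent: CoreAgent, chat_id: str, text: str) -> str:
    """Chat shell: derives a session key from the chat, then forwards."""
    return agent.run(f"chat:{chat_id}", text)


def http_entry(agent: CoreAgent, payload: Dict[str, Any]) -> str:
    """HTTP shell: extracts the last message of an OpenAI-style request body."""
    last_message = payload["messages"][-1]["content"]
    return agent.run("http:request", last_message)
```

Each shell only translates its transport's input into a call on the same object, so a change to the core's prompt shows up in all three without touching the shells.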

State, memory and profile isolation

The state and identity layer stores sessions in SQLite or JSONL, long‑term memory in MEMORY.md and user data in USER.md; profiles contain configuration, tokens and skill definitions. Profiles enforce token‑lock isolation, so two profiles cannot share the same bot credentials, but they do not automatically isolate command side effects.
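A token lock of this kind might be sketched as below. The source does not describe the mechanism, so everything here is an assumption: the class names are hypothetical, and the lock is held in memory, whereas a real implementation would need something visible across processes, such as a lock file.

```python
import hashlib
from typing import Dict


class TokenLockError(RuntimeError):
    """Raised when a second profile tries to claim an already-locked token."""


class TokenLocks:
    """Maps each bot token (by fingerprint) to the one profile allowed to use it."""

    def __init__(self) -> None:
        self._owners: Dict[str, str] = {}

    @staticmethod
    def _fingerprint(token: str) -> str:
        # Store a hash rather than the secret itself.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def acquire(self, profile: str, token: str) -> None:
        """Claim the token for a profile; re-acquiring by the owner is a no-op."""
        owner = self._owners.setdefault(self._fingerprint(token), profile)
        if owner != profile:
            raise TokenLockError(f"token already held by profile {owner!r}")

    def release(self, profile: str, token: str) -> None:
        """Free the token, but only if this profile actually holds it."""
        fingerprint = self._fingerprint(token)
        if self._owners.get(fingerprint) == profile:
            del self._owners[fingerprint]
```

Note that this guards credentials only. Two profiles holding different tokens can still run commands against the same host files, which is the gap the execution backend has to close.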

Execution backend isolation

Backend choice determines side‑effect boundaries:

local: commands run on the host without isolation.

docker, singularity, modal, daytona: provide sandboxing with read‑only root filesystem, dropped capabilities, no privilege escalation, PID limits and isolated namespaces.

In persistent mode Docker binds workspaces to ~/.hermes/sandboxes/docker/<task_id>/; Modal and Daytona use task‑ or sandbox‑specific workspaces. Production deployments should therefore use a sandbox backend; otherwise distinct session keys (e.g., Alice vs Bob on Discord) do not prevent host‑level side‑effect interference.
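For the Docker backend, the restrictions listed above correspond to standard docker run options. The sketch below only builds the argument list and does not start a container. The helper name, the image, the PID limit value and the /workspace mount point are assumptions; the host path follows the ~/.hermes/sandboxes/docker/<task_id>/ layout from the text.

```python
from pathlib import Path
from typing import List


def docker_argv(
    task_id: str,
    command: List[str],
    image: str = "python:3.12-slim",  # assumed image, not taken from Hermes
    pids_limit: int = 256,            # assumed limit
) -> List[str]:
    """Build a `docker run` command line with the hardening options applied."""
    workspace = Path.home() / ".hermes" / "sandboxes" / "docker" / task_id
    return [
        "docker", "run", "--rm",
        "--read-only",                           # read-only root filesystem
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation
        "--pids-limit", str(pids_limit),         # cap the number of processes
        "-v", f"{workspace}:/workspace",         # per-task writable workspace
        "-w", "/workspace",
        image,
        *command,
    ]
```

The result can be passed to subprocess.run as a list. With a read-only root, only the bind-mounted workspace stays writable, so each task's side effects land in its own directory.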

Deployment packaging and asynchronous result handling

Deployment packaging determines how the agent is served: foreground, systemd, launchd, Docker, profiles or webhook mode. Webhook adapters can receive external events, run the agent and post results to GitHub, Telegram, Discord, Slack, etc. Cron jobs create new sessions, execute tasks and return results via the same routing logic. A REST endpoint /api/jobs manages cron jobs.
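The cron path described above (new session, run, route the result) can be sketched as follows. All names here are hypothetical, and the sketch deliberately does not model the /api/jobs endpoint, whose request format the source does not give.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Target:
    platform: str  # e.g. "telegram"
    chat_id: str


@dataclass(frozen=True)
class CronJob:
    prompt: str
    target: Target


class Router:
    """Sends a result to whichever platform sender is registered for the target."""

    def __init__(self) -> None:
        self._senders: Dict[str, Callable[[str, str], None]] = {}

    def register(self, platform: str, sender: Callable[[str, str], None]) -> None:
        self._senders[platform] = sender

    def deliver(self, target: Target, text: str) -> None:
        self._senders[target.platform](target.chat_id, text)


def run_job(job: CronJob, run_agent: Callable[[str, str], str], router: Router) -> str:
    """Run one scheduled job in a fresh session and route its result."""
    session_id = f"cron:{uuid.uuid4().hex}"  # new session for every run
    result = run_agent(session_id, job.prompt)
    router.deliver(job.target, result)
    return session_id
```

Because delivery goes through the same router that chat replies would use, a scheduled job needs no platform-specific code of its own.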

Production‑grade gateway relies on layered control

Hermes separates transport (streaming adapter), conversation state (deterministic session key), approval/interrupt handling (pending approval, GatewayRunner) and side‑effect isolation (terminal backend). Each layer processes only its own concerns, providing a stable, reusable agent architecture that can be extended to enterprise‑grade platforms.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Agent, Gateway, Hermes, Session Management, Execution Isolation, Multi‑Platform Integration, Streaming Adapter