OpenAI’s Latest Open‑Source Releases: Codex CLI, Plugins, Symphony, and Privacy‑Filter
OpenAI has recently open-sourced four projects: Codex CLI, the openai/plugins repository, the engineering-preview Symphony orchestration service, and the privacy-filter model. This overview covers installation, plugin architecture, workflow-orchestration design, and usage examples, comparing the tools with competing agents and noting practical constraints.
OpenAI Open‑Source Projects
1. Codex CLI
Repository: github.com/openai/codex (Apache‑2.0, Rust).
Installation commands:
```
npm install -g @openai/codex
brew install --cask codex
```
After installation, run codex and select “Sign in with ChatGPT”. Any ChatGPT subscription tier (Plus, Pro, Business, Edu, Enterprise) can be used without a separate API key; API-key authentication is also supported.
The CLI runs locally, interacts via the terminal, and can read/write files, similar to Anthropic’s Claude Code. The billing model differs: Codex uses a ChatGPT subscription, while Claude Code is billed per API call.
2. openai/plugins – Codex Plugin Ecosystem
Repository: github.com/openai/plugins. The repository contains dozens of example plugins that expose Codex capabilities to external SaaS services.
Each plugin follows a fixed directory layout:
```
plugins/<name>/
├── .codex-plugin/plugin.json   # required manifest
├── skills/                     # optional skill definitions
├── .app.json                   # optional
├── .mcp.json                   # optional MCP integration
├── agents/                     # optional
├── commands/                   # optional
├── hooks.json                  # optional
└── assets/
```
The .mcp.json file bundles MCP services, allowing a single plugin to provide skills, commands, agents, hooks, and MCP tools together.
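To make the layout concrete, here is a small Python sketch that scaffolds and validates this structure. The manifest fields are assumptions for illustration; the actual plugin.json schema is defined in the repository.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical manifest fields; the real plugin.json schema may differ.
MANIFEST = {"name": "my-plugin", "version": "0.1.0"}

def scaffold_plugin(root: Path, name: str) -> Path:
    """Create the fixed layout described above; only the manifest is required."""
    plugin = root / "plugins" / name
    (plugin / ".codex-plugin").mkdir(parents=True)
    (plugin / ".codex-plugin" / "plugin.json").write_text(json.dumps(MANIFEST, indent=2))
    for optional in ("skills", "agents", "commands", "assets"):
        (plugin / optional).mkdir()
    return plugin

def validate_plugin(plugin: Path) -> bool:
    """Treat a plugin as loadable iff .codex-plugin/plugin.json exists and parses."""
    manifest = plugin / ".codex-plugin" / "plugin.json"
    return manifest.is_file() and "name" in json.loads(manifest.read_text())

root = Path(tempfile.mkdtemp())
plugin = scaffold_plugin(root, "my-plugin")
ok = validate_plugin(plugin)
print(ok)  # True
```

The key point the sketch encodes: everything except the manifest is optional, so a validator only needs to check for .codex-plugin/plugin.json.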
Over 100 third‑party plugins are available, covering domains such as:
Design/Frontend – figma, canva, remotion, cloudinary, biorender
Collaboration/Office – notion, slack, teams, linear, monday‑com, clickup, gmail, outlook‑email, google‑calendar, google‑drive, sharepoint
Development/Deployment – github, netlify, vercel, cloudflare, render, circleci, sentry, coderabbit, neon‑postgres, temporal, quicknode
Data analysis – amplitude, statsig, cube, motherduck, omni‑analytics, coupler‑io
Finance/Payments – stripe, razorpay, brex, binance, carta‑crm, pitchbook, moody‑s, morningstar
AI‑related – hugging‑face, chatgpt‑apps, superpowers, plugin‑eval
Officially recommended plugins include:
plugins/figma – use‑figma, Code to Canvas, Code Connect, design‑system rules
plugins/notion – planning, research, meeting, knowledge capture
plugins/build-ios-apps / plugins/build-macos-apps – SwiftUI/AppKit workflow
plugins/build-web-apps – deployment, UI, payment, database workflow
plugins/expo – Expo and React Native apps, SDK upgrades, EAS workflow
This repository turns Codex from a single coding agent into an open platform where developers can add new tools by following the template, enabling natural‑language calls such as creating a Stripe invoice or opening a Linear issue.
3. openai/symphony – Orchestrating Codex Agents
Repository: github.com/openai/symphony (Apache‑2.0, engineering preview, not intended for production).
Problem statement: using Codex or Claude Code requires a developer to watch the IDE, feed tasks, monitor tests, and review PRs, limiting parallelism.
Symphony converts the workflow into a long‑running daemon that polls a Linear board, spawns isolated workspaces, runs Codex agents, and records proof of work (CI status, PR feedback, walkthrough video). Engineers only move tickets from “Human Review” to “Done”.
Engineers do not need to supervise Codex; they can manage the work at a higher level.
Architecture (components):
Workflow Loader: reads WORKFLOW.md, parses the YAML front‑matter and prompt template.
Config Layer: strongly typed configuration, environment‑variable resolution, pre‑run validation.
Issue Tracker Client: pulls issues, polls status, normalises data (currently Linear only).
Orchestrator: main scheduling loop; decides dispatch, retry, stop, release.
Workspace Manager: creates an isolated directory per issue, runs lifecycle hooks.
Agent Runner: launches a Codex app‑server subprocess, builds prompts, forwards events.
Status Surface: optional human‑readable status (terminal/dashboard).
Logging: structured logs.
Layered design: Strategy (WORKFLOW.md) → Config → Coordination (Orchestrator) → Execution (workspace + agent subprocess) → Integration (Linear adapter) → Observation.
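As a rough illustration of the Workflow Loader stage, the following Python sketch splits a WORKFLOW.md-style document into its YAML front-matter and Markdown prompt template, then fills {{ issue.* }} placeholders. The real loader parses the YAML into typed config; that part is omitted here, and the function names are illustrative, not Symphony's API.

```python
import re

def load_workflow(text: str) -> tuple[str, str]:
    """Split a WORKFLOW.md document into YAML front-matter and prompt template."""
    match = re.match(r"^---\n(.*?)\n---\n(.*)$", text, re.DOTALL)
    if not match:
        raise ValueError("WORKFLOW.md must start with a YAML front-matter block")
    front_matter, template = match.groups()
    return front_matter, template

def render_prompt(template: str, issue: dict) -> str:
    """Fill {{ issue.field }} placeholders from an issue dict."""
    return re.sub(
        r"\{\{\s*issue\.(\w+)\s*\}\}",
        lambda m: str(issue.get(m.group(1), "")),
        template,
    )

workflow = "---\ntracker:\n  kind: linear\n---\nWork on {{ issue.identifier }}: {{ issue.title }}"
fm, tpl = load_workflow(workflow)
prompt = render_prompt(tpl, {"identifier": "ENG-42", "title": "Fix login bug"})
print(prompt)  # Work on ENG-42: Fix login bug
```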
Core contract lives in WORKFLOW.md, a YAML front‑matter block followed by a Markdown prompt. Example snippet:
```
---
tracker:
  kind: linear
  api_key: $LINEAR_API_KEY
  project_slug: my-project
  active_states: [Todo, In Progress]
  terminal_states: [Done, Cancelled, Duplicate]
polling:
  interval_ms: 30000
workspace:
  root: ~/symphony_workspaces
  hooks:
    after_create: |
      git clone [email protected]:org/repo.git .
    before_run: |
      npm install
    after_run: |
      npm run cleanup
agent:
  max_concurrent_agents: 10
  max_turns: 20
  max_concurrent_agents_by_state:
    "in progress": 5
codex:
  command: codex app-server
  approval_policy: never
  thread_sandbox: workspace-write
  turn_timeout_ms: 3600000
  stall_timeout_ms: 300000
---
You are working on Linear issue {{ issue.identifier }}: {{ issue.title }}

Description:
{{ issue.description }}

Steps:
1. Read the description and understand the requirements.
2. Implement the change in the workspace.
3. Run tests.
4. Open a PR and move the ticket to "Human Review".
```
Key design points:
Per‑issue workspace isolation: each issue gets its own directory named after a sanitized issue.identifier. Hooks such as after_create run once (e.g., git clone), while before_run runs before each execution (e.g., npm install).
Concurrency control: a global limit (max_concurrent_agents: 10) with optional per‑state limits (e.g., {"in progress": 5}). Excess tasks queue for later dispatch.
Retry mechanism: exponential back‑off with configurable max_retry_backoff_ms (default 5 min). Failed tasks enter a retry queue with monotonic timers.
Status reconciliation: each poll tick checks the Linear ticket state; if it reaches a terminal state, the corresponding agent run is stopped.
No database requirement: after a restart, state is recovered from the Linear tracker and the filesystem; in‑memory scheduling state is lost, but the source of truth remains in Linear.
Codex app‑server protocol: the Agent Runner launches codex app-server via bash -lc; STDIO follows the app‑server protocol. Supported sandbox configurations can be listed with codex app-server generate-json-schema --out <dir>.
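The concurrency and retry rules above can be sketched in a few lines of Python. DispatchGate and retry_delay_ms are illustrative names under the configuration semantics described, not Symphony's actual API.

```python
from collections import Counter

class DispatchGate:
    """Concurrency gate: a global cap plus optional per-state caps,
    mirroring max_concurrent_agents / max_concurrent_agents_by_state."""

    def __init__(self, global_cap: int, per_state_caps: dict[str, int]):
        self.global_cap = global_cap
        self.per_state_caps = per_state_caps
        self.running = Counter()  # state -> number of active agents

    def can_dispatch(self, state: str) -> bool:
        if sum(self.running.values()) >= self.global_cap:
            return False  # global limit reached; task queues for later
        cap = self.per_state_caps.get(state)
        return cap is None or self.running[state] < cap

    def start(self, state: str):
        self.running[state] += 1

    def finish(self, state: str):
        self.running[state] -= 1

def retry_delay_ms(attempt: int, base_ms: int = 1000, max_backoff_ms: int = 300_000) -> int:
    """Exponential back-off capped at max_retry_backoff_ms (default 5 min)."""
    return min(base_ms * (2 ** attempt), max_backoff_ms)

gate = DispatchGate(global_cap=10, per_state_caps={"in progress": 5})
for _ in range(5):
    gate.start("in progress")
print(gate.can_dispatch("in progress"))  # False: per-state cap of 5 reached
print(gate.can_dispatch("todo"))         # True: global cap of 10 not reached
print(retry_delay_ms(10))                # 300000: capped at the 5-minute maximum
```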
Usage paths provided by the repository:
Path 1: let a coding agent generate a language‑specific implementation from the 80 KB SPEC.md.
Path 2: use the official Elixir reference implementation (elixir/ directory). Elixir/OTP is well suited to long‑running, highly concurrent orchestration, though its community barrier to entry is higher.
Limitations noted in the source:
Requires “harness engineering”: codebases, tests, and CI must be adapted for agent interaction.
Currently supports only Linear as an issue tracker; other trackers need custom adapters.
Strongly bound to Codex; swapping agents would require protocol changes.
Marked as an engineering preview; not recommended for production environments.
4. openai/privacy-filter – PII Detection and Redaction Model
Model hosted at https://huggingface.co/openai/privacy-filter.
Key specifications:
Total parameters: 1.5 B; active MoE parameters: 50 M (128 experts, top‑4 routing)
Context length: 128 k tokens
Architecture: 8‑layer transformer, 14 query heads + 2 KV heads (GQA), d_model = 640
Runs in browsers and on laptops (CPU‑friendly)
Architecture builds on the earlier OpenAI gpt‑oss series. After autoregressive pre‑training, the language‑model head is replaced with a token‑classification head and fine‑tuned as a bidirectional token classifier.
The model tags each token with one of eight private entity types (account_number, private_address, private_email, private_person, private_phone, private_url, private_date, secret) using BIOES labeling plus an “O” background class, yielding 33 output classes. Decoding uses a constrained Viterbi algorithm for coherent spans.
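The class count follows directly from the labeling scheme: 8 entity types × 4 BIOES span tags + 1 shared background class = 33. A small sketch, including the kind of transition constraint a Viterbi decoder would enforce (the model's exact constraint set is not documented here, so treat valid_transition as a standard BIOES sketch):

```python
ENTITY_TYPES = [
    "account_number", "private_address", "private_email", "private_person",
    "private_phone", "private_url", "private_date", "secret",
]
# BIOES: Begin / Inside / End / Single tags per entity type, plus one "O" class.
PREFIXES = ["B", "I", "E", "S"]
LABELS = ["O"] + [f"{p}-{t}" for t in ENTITY_TYPES for p in PREFIXES]
print(len(LABELS))  # 33

def valid_transition(prev: str, nxt: str) -> bool:
    """Allow only coherent BIOES sequences, as constrained decoding requires."""
    if prev == "O" or prev.startswith(("E-", "S-")):
        # After background or a closed span, only background or a new span start.
        return nxt == "O" or nxt.startswith(("B-", "S-"))
    # Inside an open span (B-/I-), we may only continue or end the SAME type.
    ptype = prev.split("-", 1)[1]
    return nxt in (f"I-{ptype}", f"E-{ptype}")

print(valid_transition("B-private_email", "E-private_email"))  # True
print(valid_transition("B-private_email", "O"))                # False
```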
Python usage example:
```python
from transformers import pipeline

classifier = pipeline(
    task="token-classification",
    model="openai/privacy-filter",
)
classifier("My name is Alice Smith")
```
JavaScript (Transformers.js) example (WebGPU + q4 quantisation):
```javascript
import { pipeline } from "@huggingface/transformers";

const classifier = await pipeline(
  "token-classification",
  "openai/privacy-filter",
  { device: "webgpu", dtype: "q4" }
);

const input = "My name is Harry Potter and my email is [email protected].";
const output = await classifier(input, { aggregation_strategy: "simple" });
```
Sample output:
```javascript
[
  { entity_group: 'private_person', score: 0.99999, word: ' Harry Potter' },
  { entity_group: 'private_email', score: 0.99999, word: ' [email protected]' }
]
```
Typical use cases: log redaction, training‑data cleaning, internal document sanitisation, pre‑filtering customer‑service dialogues.
The model’s 50 M active parameters enable CPU inference; precision‑recall trade‑offs can be tuned via decoding parameters. It is primarily English‑focused; multilingual robustness is limited and Chinese performance is not guaranteed. For Chinese scenarios, fine‑tuning is required, which is feasible given the modest model size.
OpenAI notes that the model is an auxiliary data‑minimisation tool, not a compliance‑grade anonymisation solution; production deployments should add additional rule‑based safeguards.
Old Zhang's AI Learning
AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.