
Controlling LLM‑Based AI Agents with the Open‑Source ‘Agents’ Framework

This article introduces the experimental open‑source project ‘Agents’, explains common challenges of LLM‑based AI agents, compares it with tools like AutoGPT, LangChain and MetaGPT, and demonstrates how its configuration‑driven SOP approach enables more controllable, multi‑agent interactions and easier deployment.

AI Large Model Application Practice

Oct 4, 2023

Background

Autonomous AI agents built on large language models (LLMs) have become a hot research focus because they can understand natural language and perform a wide range of digital tasks, hinting at the path toward artificial general intelligence (AGI). The open‑source community offers many agent‑oriented projects, which can be divided into two categories.

Common Types of Projects

Ready‑to‑use agents: AutoGPT, BabyAGI, Generative Agents, Web Agent, MetaGPT. These aim to perform specific tasks out of the box but provide limited extensibility.

Frameworks and toolkits: LangChain, Camel, etc. These expose SDKs that developers use to build custom agents, but they require substantial technical knowledge.

Typical Problems

Many projects are merely demos and are not mature platforms for building or customizing agents.

Frameworks such as LangChain are powerful yet complex, demanding extensive development, debugging, and testing.

Some open‑source agents focus only on a single core capability (e.g., task planning, long‑term memory, or tool use) and lack overall completeness.

Agents often rely on short natural‑language prompts, making their behavior unpredictable and hard to debug.

Introducing the Agents Project

The experimental open‑source project named Agents claims to address the above pain points by offering a more generic, concise, and controllable way to construct LLM‑based agents. It provides essential capabilities such as short‑ and long‑term memory, tool usage, and web search.

Key Features

Simplified construction: A single config.json file can define a complete agent for a specific scenario, with an optional Web UI for editing it.

Multi‑agent collaboration: Multiple agents can cooperate, and an LLM‑driven controller decides which agent acts next based on the current task stage.

Human‑agent interaction: Humans can take on roles within multi‑agent scenarios and actively participate in tasks such as debates or game missions.

SOP‑based controllability: Standard Operating Procedures (SOPs) allow precise definition of sub‑tasks, roles, prompts, rules, and examples, making the execution process more predictable and easier to debug.

Underneath these features, Agents implements the basic building blocks an agent needs: short‑term memory, long‑term memory, tool usage, and web search.
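The human‑agent interaction feature can be pictured as a dispatcher that sends each turn either to a person or to the LLM, depending on the role. The following is a minimal sketch under stated assumptions, not the project's actual code: `make_dispatcher`, `run_debate`, and both stub responders are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Set

# (role, context) -> utterance. Both responders below are hypothetical stand-ins.
Responder = Callable[[str, str], str]


def make_dispatcher(human_roles: Set[str], ask_human: Responder, ask_llm: Responder) -> Responder:
    """Route each turn to a person or to the LLM, depending on the role."""
    def respond(role: str, context: str) -> str:
        source = ask_human if role in human_roles else ask_llm
        return source(role, context)
    return respond


def run_debate(roles: List[str], respond: Responder, rounds: int) -> List[str]:
    """Let the roles speak in turn; each speaker sees the transcript so far."""
    transcript: List[str] = []
    for _ in range(rounds):
        for role in roles:
            utterance = respond(role, "\n".join(transcript))
            transcript.append(f"{role}: {utterance}")
    return transcript


# Offline stubs: a real setup would call an LLM API and read from a UI or stdin.
def stub_llm(role: str, context: str) -> str:
    return f"LLM argument #{context.count(role + ':') + 1}"


def stub_human(role: str, context: str) -> str:
    return "typed by a person"


respond = make_dispatcher({"negative"}, stub_human, stub_llm)
transcript = run_debate(["affirmative", "negative"], respond, rounds=2)
print("\n".join(transcript))
```

Keeping the human behind the same interface as the LLM means the rest of the loop does not need to know which roles are played by people.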

Simple Dialogue Agent Demo

A minimal dialogue agent built with Agents follows three steps:

1. Use the LLM to understand the user query, decide whether a web search is needed, and if so generate search keywords.

2. Invoke a search engine (e.g., Bing) to retrieve relevant information.

3. Feed the retrieved information back to the LLM for summarization and response.

This demo showcases core abilities such as LLM‑driven planning, tool integration, and short‑term conversational memory.
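The three steps above can be sketched in a few lines of Python. This is an illustrative outline only, assuming the LLM and the search engine are plain callables; `plan_search`, `answer`, the `SEARCH:` reply convention, and the two stubs are all hypothetical and are not taken from the Agents codebase.

```python
from typing import Callable, List, Optional

# Hypothetical stand-ins: a real agent would call an LLM API and a search API here.
LLM = Callable[[str], str]
Search = Callable[[str], List[str]]


def plan_search(llm: LLM, query: str) -> Optional[str]:
    """Step 1: ask the LLM whether a search is needed and, if so, for keywords."""
    reply = llm(
        "Reply 'SEARCH: <keywords>' if answering needs a web search, "
        f"otherwise reply 'NO_SEARCH'.\nUser: {query}"
    ).strip()
    if reply.startswith("SEARCH:"):
        return reply[len("SEARCH:"):].strip()
    return None


def answer(llm: LLM, search: Search, query: str, history: List[str]) -> str:
    keywords = plan_search(llm, query)
    # Step 2: call the search tool only when the LLM asked for it.
    snippets = search(keywords) if keywords else []
    # Step 3: let the LLM turn the retrieved snippets into a reply.
    context = "\n".join(snippets)
    recent = "\n".join(history[-6:])  # short-term memory: last few lines only
    reply = llm(f"History:\n{recent}\nContext:\n{context}\nUser: {query}\nAnswer:")
    history.extend([f"User: {query}", f"Agent: {reply}"])
    return reply


# Offline stubs so the sketch runs without network access.
def fake_llm(prompt: str) -> str:
    if prompt.startswith("Reply 'SEARCH:"):
        return "SEARCH: weather Paris" if "weather" in prompt else "NO_SEARCH"
    return "Summary of: " + prompt.split("Context:\n")[1].split("\nUser:")[0]


def fake_search(keywords: str) -> List[str]:
    return [f"result for {keywords}"]


history: List[str] = []
print(answer(fake_llm, fake_search, "weather in Paris?", history))
```

Separating the planning call from the answering call keeps the search decision inspectable, which is the kind of debuggability the SOP approach is after.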

Configuration Details

When building an agent with Agents, the configuration file must specify:

The involved AI roles and which role (if any) represents the human.

The LLM model and required API parameters (URL, key, etc.).

SOP definition, which includes:

Sub‑tasks (States) for each stage of the overall task.

Roles assigned to each State, together with their prompts, language style, rules, and examples.

Sequencing and transition rules between States, culminating in an end_state that signals task completion.
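To make the list above concrete, here is a sketch of what such a configuration could look like, together with a small consistency check. Every key name (`llm`, `roles`, `sop`, `start_state`, `next`, and so on), the placeholder URL, and the `validate` helper are assumptions made for illustration; the real Agents schema may differ, so consult the project's own examples before writing a config.

```python
import json
from collections import deque
from typing import List

# Hypothetical layout: every key name below is illustrative, not the real Agents schema.
CONFIG_JSON = """
{
  "llm": {"model": "your-model-name", "api_url": "https://example.invalid/v1", "api_key": "YOUR_KEY"},
  "roles": {"assistant": {"is_human": false}, "user": {"is_human": true}},
  "sop": {
    "start_state": "chat",
    "end_state": "end_state",
    "states": {
      "chat": {
        "roles": {
          "assistant": {
            "prompt": "Answer the user's question.",
            "style": "concise and friendly",
            "rules": ["Search the web when facts may be outdated."],
            "examples": []
          },
          "user": {}
        },
        "next": ["chat", "end_state"]
      }
    }
  }
}
"""


def validate(cfg: dict) -> List[str]:
    """Return a list of problems; an empty list means the SOP is consistent."""
    problems: List[str] = []
    sop = cfg["sop"]
    states, start, end = sop["states"], sop["start_state"], sop["end_state"]
    if start not in states:
        problems.append(f"start_state {start!r} is not defined")
    for name, state in states.items():
        for role in state["roles"]:
            if role not in cfg["roles"]:
                problems.append(f"state {name!r} uses undefined role {role!r}")
        for target in state["next"]:
            if target != end and target not in states:
                problems.append(f"state {name!r} points to unknown state {target!r}")
    # Breadth-first search: the end state must be reachable from the start state.
    seen, queue = {start}, deque([start])
    while queue:
        for target in states.get(queue.popleft(), {}).get("next", []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    if end not in seen:
        problems.append(f"end_state {end!r} is unreachable from {start!r}")
    return problems


config = json.loads(CONFIG_JSON)
print(validate(config))  # an empty list means no problems were found
```

Checking role references and reachability of the end state before running anything catches the most common mistakes in a hand-written SOP.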

The original article illustrates this with a diagram of a simple SOP configuration for a single‑agent dialogue scenario.

Comparison with Other Frameworks

MetaGPT can also build multi‑agent systems with relatively simple code, but it offers limited fine‑grained control over the execution flow. Agents, by contrast, emphasizes SOP‑driven control: developers define each stage of the task process in detail, so execution follows a more predictable path.

Software‑Company Agent Case Study

The article presents a more complex example: a simulated software‑development company composed of multiple agents (designer, programmer, debugger, etc.). The SOP divides the software development lifecycle into design, development, and debugging states, each with specific roles and prompts. Running the configuration demonstrates a step‑by‑step progression through these states.
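The progression through design, development, and debugging can be modeled as a small state machine in which a controller proposes the next state and the SOP decides whether that transition is allowed. The sketch below is an assumption-laden illustration, not the framework's implementation: `State`, `run_sop`, the stub functions, and the rule of staying in place on a disallowed transition are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class State:
    role: str               # which agent speaks in this state
    prompt: str             # instruction given to that agent
    next_states: List[str]  # transitions the SOP allows from here


# Hypothetical SOP mirroring the design -> development -> debugging flow.
SOP: Dict[str, State] = {
    "design": State("designer", "Write the design document.", ["design", "development"]),
    "development": State("programmer", "Implement the design.", ["development", "debugging"]),
    "debugging": State("debugger", "Test the code and report bugs.", ["development", "end_state"]),
}

Act = Callable[[State, List[str]], str]     # produces the agent's message
Decide = Callable[[State, List[str]], str]  # proposes the next state


def run_sop(sop: Dict[str, State], start: str, end: str, act: Act, decide: Decide,
            max_steps: int = 20) -> Tuple[List[str], List[str]]:
    visited: List[str] = []
    transcript: List[str] = []
    current = start
    for _ in range(max_steps):
        if current == end:
            break
        state = sop[current]
        visited.append(current)
        transcript.append(f"{state.role}: {act(state, transcript)}")
        proposed = decide(state, transcript)
        # Guardrail: ignore any transition the SOP does not list and stay put.
        if proposed in state.next_states:
            current = proposed
    return visited, transcript


# Offline stubs; a real system would back both with LLM calls.
def stub_act(state: State, transcript: List[str]) -> str:
    return f"finished '{state.prompt}'"


def stub_decide(state: State, transcript: List[str]) -> str:
    return state.next_states[-1]


visited, transcript = run_sop(SOP, "design", "end_state", stub_act, stub_decide)
print(visited)
```

Because the controller can only choose among transitions the SOP lists, a misbehaving model cannot skip from design straight to the end state; the `max_steps` cap bounds the run if it never proposes a valid move.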

Current Limitations and Future Work

Agents mainly excels at text‑centric tasks (code generation, script writing, design) and lacks extensive support for operation‑type tasks that require direct system control.

The configuration‑first approach simplifies usage but sacrifices flexibility and extensibility for complex enterprise scenarios.

Compared with mature frameworks like LangChain, Agents still offers limited integration with diverse LLMs, embedding models, vector stores, external APIs, and prompt‑engineering patterns.

Conclusion

Agents demonstrates that a configuration‑driven SOP methodology can make LLM‑based AI agents more controllable and easier to deploy, especially for multi‑agent collaborations and human‑in‑the‑loop scenarios. While still experimental, the project points toward a future where autonomous agents can be reliably orchestrated for complex tasks.



