AI Agents: Concepts, Key Components, and Development Frameworks
AI agents extend large language models with planning, short‑term and long‑term memory, and tool use, enabling them to decompose tasks autonomously, call external APIs, and retrieve persistent knowledge. Frameworks such as MetaGPT, LangChain, and CrewAI simplify building agents, for example a researcher agent that gathers information, browses web content, and generates reports, pointing toward broader AI‑enhanced productivity.
In the era of rapid AI development, large language models (LLMs) have shown impressive capabilities in understanding input, reasoning, and generating output. However, unlike humans, LLMs lack planning, memory, and tool‑use abilities, which limits their practical applicability.
An AI agent is defined as a general problem‑solver built on top of an LLM, equipped with planning, memory, and tool‑use capabilities, enabling it to autonomously complete assigned tasks.
1. Large Language Model vs. Human
LLMs can accept input, reason, and produce text, code, or media, but they do not possess the human‑like abilities to plan, remember, or interact with physical tools.
2. What Is an Agent?
An agent is a software program that uses an LLM as its "brain" and adds three essential components:
Planning : Decompose complex tasks into subtasks, devise execution order, and reflect on progress.
Memory : Short‑term memory stores intermediate context during a task; long‑term memory (often a vector database) retains knowledge across sessions.
Tool Use : APIs such as calculators, search engines, code executors, or database queries allow the agent to interact with the external world.
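The three components above can be sketched as a minimal agent loop. All names here (plan, TOOLS, run_agent) are illustrative stand-ins, not the API of any real framework; in practice the planning and tool steps would each be LLM calls.

```python
# Minimal sketch of an agent loop: plan -> use tools -> record to memory.
# plan() naively decomposes a task; a real agent would ask the LLM to do this.

def plan(task):
    """Decompose a task into ordered subtasks (toy version)."""
    return [f"research: {task}", f"summarize: {task}", f"report: {task}"]

# Toy tool registry; real tools would be search engines, browsers, executors.
TOOLS = {
    "research": lambda q: f"notes on {q}",
    "summarize": lambda q: f"summary of {q}",
    "report": lambda q: f"report for {q}",
}

def run_agent(task):
    memory = []                          # short-term memory for this task
    for step in plan(task):              # planning
        tool_name, _, arg = step.partition(": ")
        result = TOOLS[tool_name](arg)   # tool use
        memory.append(result)            # remember intermediate results
    return memory[-1]                    # final output
```

The point of the sketch is the control flow: the agent, not the human, decides which tool to invoke at each step and carries context forward between steps.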
3. What Can Agents Do?
As an illustration, consider a researcher agent built with the MetaGPT framework that can automatically gather information about a topic and generate a research report.
Running the researcher agent (example command):
~ python3 -m metagpt.roles.researcher "特斯拉FSD vs 华为ADS"

The agent performs the following steps:
Collects URLs related to the query (tool: CollectLinks).
Browses each URL and summarizes its content (tool: WebBrowseAndSummarize).
Generates the final report (tool: ConductResearch).
The generated report is saved as 特斯拉FSD vs 华为ADS.md.
4. Key Components of an Agent
4.1 Planning
Planning mirrors human problem‑solving: think about the goal, examine available tools, break the goal into subtasks, execute while reflecting, and decide when to stop.
Sub‑task decomposition is essential for handling large tasks.
4.1.1 Chain‑of‑Thought (CoT)
CoT prompts such as "Answer the question: Q: {question}? Let's think step by step:" encourage the LLM to reason step‑by‑step, improving accuracy on complex problems.
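The prompt pattern from the text can be wrapped in a small helper; `cot_prompt` is a hypothetical name used only for illustration.

```python
# Sketch: wrap a question in the chain-of-thought template quoted above,
# so the model is nudged to reason step by step before answering.
def cot_prompt(question):
    return f"Answer the question: Q: {question}? Let's think step by step:"
```

The resulting string is sent as the user message; the trailing "Let's think step by step:" is what elicits the intermediate reasoning.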
4.1.2 Tree‑of‑Thought (ToT)
ToT extends CoT by exploring multiple reasoning branches and using search algorithms (BFS/DFS) to evaluate and select the best path.
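A breadth-first ToT search can be sketched as follows. `expand` and `score` stand in for LLM calls (proposing next thoughts and rating partial chains); the toy functions in the test exist only to make the search runnable.

```python
# Sketch of tree-of-thought search: expand each partial reasoning chain into
# candidate next "thoughts", score the candidates, and keep only the most
# promising few (breadth-first search with a beam).
def tree_of_thought(root, expand, score, depth=2, beam=2):
    frontier = [[root]]                  # each entry is a chain of thoughts
    for _ in range(depth):
        candidates = [path + [t] for path in frontier for t in expand(path)]
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam]     # prune to the best branches
    return max(frontier, key=score)      # best complete chain
```

A depth-first variant would instead follow one branch to full depth and backtrack when its score falls below a threshold.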
4.2 Memory
Agents implement two memory types:
Short‑term memory : Stores context generated during a task and is cleared after completion.
Long‑term memory : Persistent knowledge base (often a vector database) used for retrieval across tasks.
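The two memory types can be sketched with a small class. The word-overlap retrieval below is a toy stand-in for vector-embedding similarity; a real agent would embed texts and query a vector database. The `Memory` class and its method names are illustrative, not from any framework.

```python
# Sketch of agent memory: short-term context cleared per task, long-term
# knowledge retrieved by similarity (here: word overlap instead of embeddings).
class Memory:
    def __init__(self):
        self.short_term = []   # intermediate context, cleared after the task
        self.long_term = []    # persists across tasks

    def remember(self, text, persist=False):
        (self.long_term if persist else self.short_term).append(text)

    def end_task(self):
        self.short_term.clear()   # short-term memory does not survive the task

    def retrieve(self, query, k=1):
        q = set(query.lower().split())
        scored = sorted(self.long_term,
                        key=lambda t: len(q & set(t.lower().split())),
                        reverse=True)
        return scored[:k]
```

Swapping the overlap score for cosine similarity over embeddings turns `retrieve` into the usual vector-database lookup.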
4.3 Tool Use
Agents expose external functionality to the model as functions. Function calling lets the LLM request tool execution by returning the function name and JSON‑encoded arguments.
Example function definition for a weather‑forecast tool:
tools = [{
    "type": "function",
    "function": {
        "name": "get_n_day_weather_forecast",
        "description": "Get the weather forecast for the next n days",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City or district, e.g. Nanshan District, Shenzhen",
                },
                "format": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit to use: celsius or fahrenheit",
                },
                "num_days": {
                    "type": "integer",
                    "description": "Number of days to forecast",
                },
            },
            "required": ["location", "format", "num_days"],
        },
    },
}]

Typical OpenAI SDK workflow:
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chat_completion_request(messages, tools=None, tool_choice=None, model="gpt-3.5-turbo"):
    try:
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            tools=tools,
            tool_choice=tool_choice,
        )
        return response
    except Exception as e:
        print("Unable to generate ChatCompletion response")
        print(f"Exception: {e}")
        return e

if __name__ == "__main__":
    messages = []
    messages.append({"role": "system", "content": "Do not assume what values to plug into functions. If the user's request is ambiguous, ask for clarification."})
    messages.append({"role": "user", "content": "What will the weather be like in Nanshan, Shenzhen over the next 5 days?"})
    chat_response = chat_completion_request(messages, tools=tools)
    tool_calls = chat_response.choices[0].message.tool_calls
    print("=== Response ===")
    print(tool_calls)

The LLM returns the function name (get_n_day_weather_forecast) and its arguments; the caller executes the function and feeds the result back to the model to produce a natural‑language response.
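The second half of that loop, executing the tool and returning its result to the model, can be sketched as below. The stub `get_n_day_weather_forecast` returns canned text; the message shapes follow the OpenAI chat-completions format, where the tool result is appended as a "tool" message tied to the call's id.

```python
import json

# Sketch: execute the function the model requested, then append the result
# as a "tool" message so a follow-up chat call can phrase the final answer.

def get_n_day_weather_forecast(location, format, num_days):
    """Stub tool; a real implementation would query a weather API."""
    return f"{num_days}-day forecast for {location} in {format}: sunny"

def handle_tool_call(messages, tool_call):
    args = json.loads(tool_call["function"]["arguments"])  # JSON-encoded args
    result = get_n_day_weather_forecast(**args)            # run the tool
    messages.append({"role": "tool",
                     "tool_call_id": tool_call["id"],
                     "content": result})
    return messages   # send back to the model for the natural-language reply
```

In a live run, the returned messages list (including the assistant message that carried the tool call) is passed to the chat endpoint again, and the model answers in natural language using the tool output.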
5. Development Frameworks for Agents
As of May 2024, many open‑source and commercial frameworks (e.g., MetaGPT, LangChain, CrewAI) abstract common modules such as memory, planning, retrieval‑augmented generation (RAG), and LLM invocation, allowing rapid construction of agents.
MetaGPT is highlighted as a multi‑agent framework where distinct roles (coder, tester, reviewer) collaborate to deliver software projects.
6. Outlook
With LLMs gaining longer context windows, larger parameter counts, and stronger reasoning, AI agents will continue to break new ground. They will power applications like Copilot, DB‑GPT, and many emerging AI‑enhanced workflows, reshaping software development and human productivity.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.