Unlocking AI Agents: From Fundamentals to Building Your First LLM‑Powered Agent
This comprehensive guide explores the concept of AI agents, detailing their definitions, classifications, and core interaction loops, then walks you through building a functional LLM‑driven travel assistant with step‑by‑step code, tool integration, and practical insights on agent versus workflow paradigms.
1. Introduction to AI Agents
Agents are autonomous entities that perceive their environment through sensors, act via actuators, and pursue goals with varying degrees of autonomy. Traditional agents evolved from simple reflex systems to model‑based, goal‑oriented, utility‑based, and learning agents, culminating in modern large‑language‑model (LLM) agents.
1.1 Types of Agents
Simple Reflex Agent: Maps sensor input directly to actions; fast but short‑sighted.
Model‑Based Reflex Agent: Maintains an internal world model that serves as memory.
Goal‑Based Agent: Plans actions to achieve specific objectives.
Utility‑Based Agent: Maximizes expected utility when multiple goals conflict.
Learning Agent: Improves its policy from experience, for example through reinforcement learning (RL).
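To make the first two categories concrete, here is a minimal sketch that contrasts a simple reflex agent with a model‑based one. The thermostat scenario, the temperature thresholds, and all names are assumptions chosen for illustration; they do not come from the original article.

```python
from dataclasses import dataclass, field


def simple_reflex_agent(temperature_c: float) -> str:
    """Map the current percept straight to an action, with no memory."""
    if temperature_c < 18.0:
        return "heat_on"
    if temperature_c > 24.0:
        return "cool_on"
    return "idle"


@dataclass
class ModelBasedReflexAgent:
    """Keep a small internal model (recent readings) and act on their average."""
    history: list[float] = field(default_factory=list)

    def act(self, temperature_c: float) -> str:
        self.history.append(temperature_c)
        recent = self.history[-3:]  # the "world model" is the last three readings
        average = sum(recent) / len(recent)
        return simple_reflex_agent(average)


agent = ModelBasedReflexAgent()
for reading in (20.0, 20.0, 26.0):
    smoothed_action = agent.act(reading)
```

After the three readings, the model‑based agent stays idle because the average is 22 ℃, whereas the simple reflex agent would react to the single 26 ℃ spike by switching on cooling.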
1.2 LLM‑Driven Agents
LLM agents differ from traditional ones by leveraging massive pre‑training to acquire implicit world knowledge and emergent reasoning abilities. They process high‑level natural language instructions, decompose tasks, select tools, and iteratively refine actions.
1.3 Core Interaction Loop
The agent operates in a continuous Perception → Thought → Action → Observation cycle.
[Figure: the agent interaction loop]
1.4 Task Environment (PEAS Model)
Agents are described using the PEAS framework (Performance, Environment, Actuators, Sensors). For a travel‑assistant example, performance measures include user satisfaction, the environment includes weather APIs and travel data, actuators are API calls, and sensors are user inputs and API responses.
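A PEAS description can be written down as a small record, which makes it easy to compare task environments side by side. The sketch below is one possible encoding; the `PEAS` class and its `describe` method are hypothetical names, and the field values are taken from the travel‑assistant description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PEAS:
    """Hypothetical container for a PEAS task-environment description."""
    performance: tuple[str, ...]
    environment: tuple[str, ...]
    actuators: tuple[str, ...]
    sensors: tuple[str, ...]

    def describe(self) -> str:
        rows = (
            ("Performance", self.performance),
            ("Environment", self.environment),
            ("Actuators", self.actuators),
            ("Sensors", self.sensors),
        )
        return "\n".join(f"{label}: {', '.join(items)}" for label, items in rows)


travel_assistant = PEAS(
    performance=("user satisfaction",),
    environment=("weather APIs", "travel data"),
    actuators=("API calls",),
    sensors=("user inputs", "API responses"),
)
print(travel_assistant.describe())
```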
1.5 Thought‑Action‑Observation Protocol
LLM agents output structured text:
Thought: ...
Action: function_name(arg="value")
Observations are returned as natural‑language summaries that feed back into the next cycle.
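Because the protocol is plain text, the program around the model has to parse each reply before it can call a tool. The following is a minimal sketch of such a parser, assuming that every argument is a double‑quoted string without embedded quotes; `Step` and `parse_step` are hypothetical names, not part of any library.

```python
import re
from typing import NamedTuple


class Step(NamedTuple):
    thought: str
    tool: str
    args: dict[str, str]


# "Thought: <text>" followed by "Action: <name>(<arguments>)"
STEP_RE = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*Action:\s*(?P<tool>\w+)\((?P<args>.*)\)",
    re.DOTALL,
)
# key="value" pairs inside the parentheses
ARG_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def parse_step(text: str) -> Step:
    """Split one model reply into its thought, tool name, and keyword arguments."""
    match = STEP_RE.search(text)
    if match is None:
        raise ValueError("expected 'Thought: ...' followed by 'Action: name(...)'")
    args = dict(ARG_RE.findall(match["args"]))
    return Step(match["thought"], match["tool"], args)


step = parse_step('Thought: I need the weather first.\nAction: get_weather(city="Beijing")')
```

Raising an error on malformed output, rather than guessing, lets the caller decide whether to retry the model or stop.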
2. Hands‑On Example: Building a 5‑Minute Travel Assistant
2.1 Preparation
Install required Python packages:
pip install requests tavily-python openai
2.2 Define Tools
Weather Tool using wttr.in API:
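import requests

def get_weather(city: str) -> str:
    """Return a one-line summary of the current weather from wttr.in."""
    url = f"https://wttr.in/{city}?format=j1"
    try:
        r = requests.get(url, timeout=10)  # avoid hanging on a slow response
        r.raise_for_status()
        data = r.json()
        cur = data['current_condition'][0]
        return f"{city} current weather: {cur['weatherDesc'][0]['value']}, temperature {cur['temp_C']}℃"
    except Exception as e:
        # Return the error as text so the agent can read it as an observation.
        return f"Error: {e}"
Attraction Search Tool using Tavily search API: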
import os
from tavily import TavilyClient

def get_attraction(city: str, weather: str) -> str:
    """Search Tavily for attractions that suit the given weather."""
    api_key = os.getenv('TAVILY_API_KEY')
    if not api_key:
        return "Error: TAVILY_API_KEY is not set"
    client = TavilyClient(api_key=api_key)
    query = f"Tourist attractions worth visiting in {city} in {weather} weather"
    try:
        resp = client.search(query=query, search_depth="basic", include_answer=True)
        if resp.get('answer'):
            return resp['answer']
        # Fall back to the raw results when no summarized answer is returned.
        results = [f"- {r['title']}: {r['content']}" for r in resp.get('results', [])]
        return "\n".join(results) if results else "No attractions found"
    except Exception as e:
        return f"Error: {e}"
Collect tools in a dictionary:
available_tools = {"get_weather": get_weather, "get_attraction": get_attraction}
2.3 Prompt Engineering
AGENT_SYSTEM_PROMPT = """
You are an intelligent travel assistant. Use the available tools:
- `get_weather(city)`
- `get_attraction(city, weather)`
Respond with:
Thought: ...
Action: ...
When finished, use `finish(answer="...")`.
"""
2.4 LLM Client (OpenAI‑compatible)
from openai import OpenAI

class OpenAICompatibleClient:
    """Thin wrapper around an OpenAI-compatible chat completions endpoint."""
    def __init__(self, model, api_key, base_url):
        self.client = OpenAI(api_key=api_key, base_url=base_url)
        self.model = model

    def generate(self, user_prompt, system_prompt):
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
        resp = self.client.chat.completions.create(model=self.model, messages=messages)
        return resp.choices[0].message.content
2.5 Main Loop
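import re

# MODEL_ID, API_KEY, and BASE_URL must be defined for your provider before this runs.
llm = OpenAICompatibleClient(model=MODEL_ID, api_key=API_KEY, base_url=BASE_URL)
user_prompt = "Hello, please check today's weather in Beijing, then recommend a suitable tourist attraction based on the weather."
history = [f"User: {user_prompt}"]
for _ in range(5):  # cap the number of Thought-Action-Observation rounds
    prompt = "\n".join(history)
    output = llm.generate(prompt, AGENT_SYSTEM_PROMPT)
    history.append(output)
    m = re.search(r"Action: (.*)", output, re.DOTALL)
    if not m:
        break
    action = m.group(1).strip()
    if action.startswith("finish"):
        # DOTALL lets the final answer span several lines; fall back to the raw action if parsing fails.
        fm = re.search(r'finish\(answer="(.*)"\)', action, re.DOTALL)
        print("Final answer:", fm.group(1) if fm else action)
        break
    tm = re.search(r"(\w+)\(", action)
    if not tm:
        history.append("Observation: Error: could not parse the action")
        continue
    tool_name = tm.group(1)
    args = dict(re.findall(r"(\w+)=\"([^\"]*)\"", action))
    observation = available_tools.get(tool_name, lambda **_: "Unknown tool")(**args)
    history.append(f"Observation: {observation}")
The loop demonstrates the Thought‑Action‑Observation cycle, allowing the agent to fetch weather, search attractions, and finally produce a concise answer.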
3. Agent vs. Workflow Paradigms
A workflow defines a static sequence of tasks with predetermined branching, suitable for repeatable processes such as expense approvals.
An agent is a goal‑oriented, autonomous system that perceives, reasons, plans, and adapts dynamically, as the travel‑assistant example shows.
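The difference shows up clearly in code: a workflow hard‑codes the order of steps, while an agent asks a policy which step to take next. The sketch below illustrates this with stub tools and a rule‑based policy standing in for the LLM; every name in it (`travel_workflow`, `travel_agent`, `scripted_policy`, and the stubs) is hypothetical.

```python
from typing import Callable

Tool = Callable[..., str]
Policy = Callable[[list[str]], tuple[str, dict[str, str]]]


def travel_workflow(city: str, get_weather: Tool, get_attraction: Tool) -> str:
    """Static workflow: the developer fixes the order of steps in advance."""
    weather = get_weather(city=city)
    return get_attraction(city=city, weather=weather)


def travel_agent(city: str, tools: dict[str, Tool], choose_action: Policy) -> str:
    """Agent: a policy (an LLM in the article) picks each step at run time."""
    history = [f"Goal: recommend an attraction in {city}"]
    for _ in range(5):
        name, args = choose_action(history)
        if name == "finish":
            return args["answer"]
        history.append(f"Observation: {tools[name](**args)}")
    return "No answer within the step limit"


def scripted_policy(history: list[str]) -> tuple[str, dict[str, str]]:
    """Rule-based stand-in for the LLM: decide from the observations so far."""
    observations = [h.removeprefix("Observation: ") for h in history
                    if h.startswith("Observation: ")]
    if not observations:
        return "get_weather", {"city": "Beijing"}
    if len(observations) == 1:
        return "get_attraction", {"city": "Beijing", "weather": observations[0]}
    return "finish", {"answer": observations[-1]}


def stub_weather(city: str) -> str:
    return "Sunny"


def stub_attraction(city: str, weather: str) -> str:
    return f"{city} ({weather}): try an outdoor park"


tools = {"get_weather": stub_weather, "get_attraction": stub_attraction}
workflow_result = travel_workflow("Beijing", stub_weather, stub_attraction)
agent_result = travel_agent("Beijing", tools, scripted_policy)
```

Both paths reach the same result here, but only the agent version could change its sequence of calls if the policy decided differently, for instance by skipping the attraction search or retrying a failed tool.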
[Figure: comparison of the workflow and agent approaches]
4. Conclusion
The article introduced the definition, taxonomy, and operational loop of AI agents, provided a practical LLM‑driven implementation, and clarified the distinction between static workflows and autonomous agents, laying a solid foundation for further exploration of advanced multi‑agent frameworks.
Data Party THU
Official platform of Tsinghua Big Data Research Center, sharing the team's latest research, teaching updates, and big data news.
