Revamping AliGo’s AI Travel Assistant: Multi‑Agent Architecture & Prompt Engineering

The AliGo travel platform replaced its single‑agent workflow with a modular multi‑agent system, adding dynamic prompt generation, real‑time reasoning chains, context sharing, observability, and a knowledge base. The redesign dramatically improved accuracy, stability, and user experience.

Alibaba Cloud Developer

Background

AliGo, Alibaba's intelligent travel service, originally used a workflow combined with a single intelligent agent. As business logic grew, the prompt's token count exploded, causing attention decay in the large language model (LLM). The result was low accuracy (~50% on point‑and‑click tasks) and frequent stability errors such as "item recognition error, please retry".

Pain Points

Technical bottlenecks: Excessive prompt tokens reduced model attention and accuracy.

Architecture limits: No inter‑agent communication, poor scalability, and lack of a unified planning core.

Immature context engineering: Mixed data structures and missing hierarchical context sharing increased model load.

Goals

Select an AI framework that supports efficient inter‑agent communication and capability reuse.

Improve LLM accuracy for business tasks, especially intent collection.

Eliminate long‑standing stability bugs.

Enable flexible containerized deployment for diverse client environments.

Solution Overview

Technical Stack Selection

AgentScope was chosen over LangGraph because it provides real‑time control, sandbox execution, an integrated visual tool (AgentStudio), and native support for Qwen models.

Web Framework

FastAPI was selected instead of Flask for its asynchronous capabilities, type safety, and better performance for large‑scale services.

Multi‑Agent Architecture

A hybrid "Handoffs + Routing" pattern was adopted: fixed‑intent requests are routed directly via a rule engine (fast lane), while complex, multi‑intent queries go through the LLM (slow lane). The main planning agent coordinates specialized sub‑agents to form a complete travel‑planning solution.
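The fast‑lane/slow‑lane split can be sketched as a small dispatcher. The rule table and intent names below are illustrative assumptions, not AliGo's actual rules:

```python
import re

# Hypothetical rule table: fixed intents matched by keyword patterns (fast lane).
FAST_LANE_RULES = {
    "book_hotel": re.compile(r"book.*hotel|hotel.*reservation"),
    "check_itinerary": re.compile(r"itinerary|my trip"),
}

def route(user_input: str) -> str:
    """Return the handling lane for a request: a fixed intent via the rule
    engine, or the LLM slow lane for complex, multi-intent queries."""
    text = user_input.lower()
    for intent, pattern in FAST_LANE_RULES.items():
        if pattern.search(text):
            return intent          # fast lane: route directly, no LLM call
    return "llm_slow_lane"         # slow lane: hand off to the planning LLM

print(route("Please book a hotel in Hangzhou"))   # book_hotel
print(route("Plan a 3-day trip with flights, hotels and meetings"))
```

The point of the split is cost and latency: deterministic requests never touch the model, so the LLM budget is spent only on genuinely ambiguous queries.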

Intent Recognition Agent

Multi‑intent classification and disambiguation.

Agent scheduling based on predefined rules.

Query rewriting to standardize user input.

Two‑stage output: reasoning process + JSON decision.

def get_prompt_main_plan(user_input: str) -> str:
    # Runtime decision: select prompt template based on rule engine result
    rule_match_result = classifier.classify(user_input)
    if rule_match_result:
        # Rule matched: generate simple intent prompt and route directly
        return _get_simple_intent_prompt(...)
    else:
        # No rule match: generate complex intent prompt and invoke intent_recognition_agent
        return _get_complex_intent_prompt(...)
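The two‑stage output (free‑text reasoning followed by a JSON decision) then needs to be split apart before scheduling. A minimal parsing sketch, assuming a `---` delimiter convention and illustrative field names:

```python
import json

def parse_two_stage_output(raw: str) -> tuple[str, dict]:
    """Split a model reply into its reasoning text and the trailing JSON
    decision block. The '---' delimiter is an assumed convention."""
    reasoning, _, decision_json = raw.rpartition("---")
    return reasoning.strip(), json.loads(decision_json)

raw_reply = """The user wants both a flight and a hotel, so two intents apply.
---
{"intents": ["book_flight", "book_hotel"], "agent": "planning_agent"}"""

reasoning, decision = parse_two_stage_output(raw_reply)
print(decision["intents"])   # ['book_flight', 'book_hotel']
```

Keeping the reasoning stage in the reply improves accuracy, while the machine‑readable tail keeps downstream scheduling deterministic.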

Real‑Time Reasoning Chain

Using ReAct agents, each reasoning step and tool action is streamed to the user, reducing anxiety during long LLM processing. Hooks intercept tool usage and results, enabling fine‑grained tracking.

async def _acting(self, tool_call: ToolUseBlock) -> Msg | None:
    tool_res_msg = Msg(...)
    response_msg = None
    tool_res = await self.toolkit.call_tool_function(tool_call)
    async for chunk in tool_res:
        tool_res_msg.content[0]["output"] = chunk.content
        # Stream every intermediate result to the user, except the
        # successful finish call, whose response is returned instead.
        if tool_call["name"] != self.finish_function_name or not chunk.metadata.get("success"):
            await self.print(tool_res_msg, chunk.is_last)
        if chunk.is_interrupted:
            raise asyncio.CancelledError()
        if tool_call["name"] == self.finish_function_name and chunk.metadata and chunk.metadata.get("success"):
            response_msg = chunk.metadata.get("response_msg")
    return response_msg
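The interception the hooks perform can be sketched generically as a wrapper around any tool function. This is an illustrative pattern, not the AgentScope hook API; the hook signatures are assumptions:

```python
import functools
import time

def with_tool_hooks(tool_fn, on_start, on_end):
    """Wrap a tool function so pre/post hooks observe every call and result."""
    @functools.wraps(tool_fn)
    def wrapped(*args, **kwargs):
        on_start(tool_fn.__name__, args, kwargs)
        started = time.perf_counter()
        result = tool_fn(*args, **kwargs)
        on_end(tool_fn.__name__, result, time.perf_counter() - started)
        return result
    return wrapped

events = []

def search_hotels(city):
    # Hypothetical tool used only for this sketch.
    return [f"{city} Grand Hotel"]

traced = with_tool_hooks(
    search_hotels,
    on_start=lambda name, args, kwargs: events.append(("start", name)),
    on_end=lambda name, result, elapsed: events.append(("end", name)),
)
print(traced("Hangzhou"))   # ['Hangzhou Grand Hotel']
```

Because the hooks see both the call and the timed result, the same mechanism feeds fine‑grained tracking and the streamed reasoning chain.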

Prompt Engineering

The original monolithic prompt was replaced by a dynamic prompt assembly mechanism that acts as a state machine. The rule engine classifies user intent at runtime, allowing the system to focus model attention on the current conversation branch and switch prompts accordingly.

def react_loop(llm_agent, observation, history):
    # ReAct loop: alternate reasoning and tool execution until the model
    # emits a Finish action.
    while True:
        thought, action, action_input = llm_agent.reason_and_act(observation, history)
        if action == "Finish":
            return thought  # final answer
        result = execute_tool(action, action_input)
        observation = result
        history.append((thought, action, result))
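The state‑machine view of prompt assembly can be sketched as a transition table: each conversation state maps to a compact prompt fragment, so only the active branch's instructions reach the model. State names and templates here are illustrative assumptions:

```python
# Illustrative states and prompt fragments; the real system's branches differ.
PROMPT_TEMPLATES = {
    "collect_intent": "Identify the user's travel intents and ask for missing details.",
    "book_hotel": "Collect city, dates and budget, then call the hotel tool.",
    "confirm": "Summarize the plan and ask the user to confirm.",
}

# (current_state, event) -> next_state
TRANSITIONS = {
    ("collect_intent", "hotel_intent_found"): "book_hotel",
    ("book_hotel", "slots_filled"): "confirm",
}

def next_prompt(state: str, event: str) -> tuple[str, str]:
    """Advance the conversation state machine and return the prompt for it."""
    state = TRANSITIONS.get((state, event), state)
    return state, PROMPT_TEMPLATES[state]

state, prompt = next_prompt("collect_intent", "hotel_intent_found")
print(state)   # book_hotel
```

Switching prompts per state keeps each request's token count small, which is exactly what counters the attention decay described in the background section.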

Context Management

Global context manager for shared state.

Session memory for per‑conversation history.

Dynamic prompt (state machine) generation.

Tool registration and management.
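The responsibilities above can be collected into one context‑manager object. A minimal sketch under assumed names (the real component's interface is not shown in the source):

```python
from collections import defaultdict

class ContextManager:
    """Illustrative global context store: shared state, per-session history,
    and a tool registry, so agents in one session see the same context."""

    def __init__(self):
        self.global_state = {}             # shared across all sessions
        self.sessions = defaultdict(list)  # session_id -> message history
        self.tools = {}                    # registered tool functions

    def register_tool(self, name, fn):
        self.tools[name] = fn

    def append_message(self, session_id, role, content):
        self.sessions[session_id].append({"role": role, "content": content})

    def history(self, session_id):
        return list(self.sessions[session_id])

ctx = ContextManager()
ctx.register_tool("query_tool", lambda q: "stub answer")
ctx.append_message("s1", "user", "Find me a hotel")
print(len(ctx.history("s1")))   # 1
```

Centralizing these four concerns in one place is what lets the dynamic prompt generator and the agents draw on a consistent view of the conversation.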

Memory Architecture

Each agent maintains its own dialogue history while a shared session ID enables context sharing across related agents. A layered memory design (global, session, per‑agent) ensures isolation and efficient reuse.
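The layered lookup can be sketched as a chain of dictionaries: per‑agent entries shadow session entries, which shadow global entries, and writes stay agent‑local. The class and keys below are illustrative assumptions:

```python
class LayeredMemory:
    """Sketch of layered memory: lookups fall through agent -> session ->
    global, while writes go only to the agent's own layer (isolation)."""

    def __init__(self, global_mem: dict, session_mem: dict):
        self.layers = [{}, session_mem, global_mem]

    def get(self, key, default=None):
        for layer in self.layers:
            if key in layer:
                return layer[key]
        return default

    def set(self, key, value):
        self.layers[0][key] = value  # agent-local write, invisible to peers

session = {"destination": "Hangzhou"}    # shared via the session ID
global_mem = {"currency": "CNY"}         # shared by every session
agent_a = LayeredMemory(global_mem, session)
agent_a.set("draft_plan", "3-day city tour")
print(agent_a.get("destination"), agent_a.get("currency"))   # Hangzhou CNY
```

Two agents built over the same `session` dict share conversation context without ever seeing each other's private drafts, which is the isolation‑plus‑reuse property the design aims for.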

Observability Platform

A full‑stack observability system was built with Langfuse (open‑source). It provides trace, session, and user‑level logs, visual call chains, and metrics such as latency, token consumption, and tool usage.

{
  "type": "tool",
  "id": "{id}",
  "name": "query_tool",
  "input": {"question": "什么是差标管控"}
}
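A record in that shape can be produced by a small helper before it is exported to Langfuse as part of a trace. The function and extra `start_time` field are illustrative assumptions, not the Langfuse SDK:

```python
import time
import uuid

def tool_span(name: str, tool_input: dict) -> dict:
    """Build a tool-call record in the shape shown above; in production this
    would be attached to the current trace and shipped to Langfuse."""
    return {
        "type": "tool",
        "id": str(uuid.uuid4()),
        "name": name,
        "input": tool_input,
        "start_time": time.time(),  # lets the backend compute latency
    }

span = tool_span("query_tool", {"question": "What is travel expense standard control?"})
print(span["name"])   # query_tool
```

Emitting one such span per tool call is what makes the visual call chains and per‑tool latency/token metrics possible.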

Knowledge Base Integration

MaxKB was selected as the enterprise knowledge base because it matches the Python + PostgreSQL stack, offers document parsing with image handling, supports on‑premise deployment, and provides multi‑tenant isolation.

Evaluation System

An AI evaluation platform based on the Tongyi Qianwen model automatically scores agent outputs for accuracy and relevance, manages test lifecycles, and generates visual reports for version comparison.
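The scoring step can be sketched as an LLM‑as‑judge harness. Here the Tongyi Qianwen judging call is stubbed with a word‑overlap heuristic so the sketch is runnable; the function names and case schema are assumptions:

```python
def judge(question: str, answer: str, reference: str) -> float:
    """Stub for an LLM judging call: returns a 0-1 relevance score.
    In the real platform this would prompt Tongyi Qianwen to grade the answer."""
    answer_words = set(answer.lower().split())
    reference_words = set(reference.lower().split())
    return len(answer_words & reference_words) / max(len(reference_words), 1)

def evaluate(cases: list[dict]) -> float:
    """Score every test case and return the mean, for version comparison."""
    scores = [judge(c["q"], c["answer"], c["reference"]) for c in cases]
    return sum(scores) / len(scores)

cases = [
    {"q": "Refund policy?",
     "answer": "Refunds within 24 hours are free",
     "reference": "Refunds are free within 24 hours"},
]
print(round(evaluate(cases), 2))   # 1.0
```

Running the same case set against each release and comparing the aggregate score is what turns the evaluation platform into a regression gate between versions.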

Results

Item‑collection accuracy increased from ~50% to over 90%.

All previously intractable stability bugs were resolved (100% fix rate).

Performance and stability improved steadily after the code‑first release in December.

Recognized by InfoQ as a top AI Agent product of 2025 and awarded the 2025 AI Solution Award by QuantumBit.

Future Plans

Continue to automate cold‑start optimization, anomaly detection, and prompt‑tuning agents to build a self‑evolving multi‑agent system, further enhancing intelligence, agility, and reliability.

AgentScope, the open‑source enterprise‑grade agent framework from Tongyi Lab, powers this solution. Repository:

https://github.com/agentscope-ai
Written by

Alibaba Cloud Developer

Alibaba's official tech channel, featuring all of its technology innovations.
