Create a Real‑Time AI Voice Assistant on Alibaba Cloud in Minutes
Learn how to quickly build a custom AI voice assistant using Alibaba Cloud's ARTC real‑time audio‑video service, integrate it into websites or mobile apps with a few lines of JavaScript, and customize workflows, agents, and private knowledge bases through the IMS and Function Compute platforms.
This tutorial shows how to build a dedicated AI agent through a visual, no‑code console and deliver real‑time AI voice interaction over the ARTC network, providing an instant, natural conversational experience.
Why Choose AI Real‑Time Voice Interaction?
Highly human‑like experience: Latency of only 1.5 s for smooth voice exchange.
Supports intelligent noise cancellation, semantic recognition, sentence optimization, high‑fidelity voice tones, and digital avatars.
Flexible agent orchestration: The console offers visual, no‑code operations to easily integrate AI components (ASR, TTS, digital humans, LLMs, etc.) and rapidly build enterprise‑specific cloud AI agents.
Open AI ecosystem: Built‑in Alibaba Cloud Bailian platform capabilities plus third‑party plugins and custom models.
High‑quality low‑latency calls: Global ARTC nodes and QoS policies ensure superior audio‑video quality worldwide.
Overall Architecture
AI real‑time interaction is end‑to‑end audio‑video communication between a user and a cloud AI agent.
Workflow:
User initiates an audio‑video call request.
The AI agent receives the stream and triggers a workflow to process the request.
The agent generates an audio‑video response stream and pushes it through the ARTC network.
User receives and plays the response, completing a natural dialogue.
The AI agent is the core component. It is created via the Intelligent Media Service (IMS) and runs a workflow composed by drag‑and‑drop, which incorporates speech‑to‑text, large language models, text‑to‑speech, and vector databases. Real‑time audio‑video (ARTC) provides high availability, high quality, and ultra‑low latency. Web services are deployed with Function Compute (FC).
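As a rough illustration of the workflow above, the sketch below chains the three processing stages for one conversational turn. The function names (speechToText, generateReply, textToSpeech, handleTurn) and their stubbed behavior are hypothetical placeholders, not IMS or ARTC APIs; in the real service these stages are configured in the console rather than written by hand.

```javascript
// Hypothetical stand-ins for the workflow stages. A real agent would use the
// ASR, LLM, and TTS components configured in the IMS console instead.
const speechToText = async (audioChunk) => audioChunk.transcript;
const generateReply = async (text) => `You said: ${text}`;
const textToSpeech = async (text) => ({ format: 'pcm', spokenText: text });

// One conversational turn: user audio in, agent audio out.
async function handleTurn(audioChunk) {
  const userText = await speechToText(audioChunk);   // speech-to-text
  const replyText = await generateReply(userText);   // large language model
  const replyAudio = await textToSpeech(replyText);  // text-to-speech
  return { userText, replyText, replyAudio };
}
```

Calling handleTurn({ transcript: 'hello' }) resolves to an object whose replyAudio would, in a real deployment, be the stream pushed back to the user over ARTC.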
Key Technical Modules
Intelligent Media Service (IMS): Creates AI agents and workflows.
Real‑time Audio‑Video (ARTC): Provides over 3,200 global nodes for low‑latency, high‑quality communication.
Function Compute (FC): Hosts lightweight web services.
Experience Tutorial
Create a real‑time audio‑video communication app using ARTC.
Create an AI agent via IMS.
Arrange a real‑time workflow that includes STT, TTS, LLM, etc.
Deploy the application with Function Compute.
Test the experience on web or mobile.
Web Integration Example
Insert the following JavaScript snippet into your web page (requires HTTPS and a valid SSL certificate):
<!-- Container for rendering ARTC AI Call UI -->
<div id="root"></div>
<!-- Include ARTC AI Call UI script -->
<script src="https://g.alicdn.com/apsara-media-aui/amaui-web-aicall/1.6.2/aicall-ui.js"></script>
<!-- Initialize and render the UI -->
<script>
  new ARTCAICallUI({
    userId: 'id',
    root: document.getElementById('root'),
    appServer: 'https://<url>',
    agentType: 0,
    userToken: 'token'
  }).render();
</script>
Parameter description:
userId : String identifier required by your business logic.
root : DOM element where the UI is rendered.
appServer : URL of the AI real‑time voice interaction service (the Function Compute endpoint).
agentType : Call type – 0 for voice, 1 for digital‑human, 2 for video understanding.
userToken : Optional authentication token; if provided, it must be non‑empty.
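The parameter rules above can be checked before the UI is created. The helper below is a minimal sketch; validateCallOptions and AGENT_TYPES are hypothetical names and are not part of the ARTC AI Call UI script. Requiring a non‑empty userId and an https:// appServer are assumptions drawn from the example values, not documented constraints. The root element is not checked because that requires a browser DOM.

```javascript
// Labels for the documented agentType values.
const AGENT_TYPES = { 0: 'voice', 1: 'digital-human', 2: 'video understanding' };

// Hypothetical helper (not part of the SDK): returns a list of problems
// found in an options object intended for ARTCAICallUI.
function validateCallOptions(options) {
  const errors = [];
  // Assumption: an empty userId is not useful to the business logic.
  if (typeof options.userId !== 'string' || options.userId.length === 0) {
    errors.push('userId must be a non-empty string');
  }
  // Assumption: the app server is reached over HTTPS, as in the example value.
  if (typeof options.appServer !== 'string' || !options.appServer.startsWith('https://')) {
    errors.push('appServer must be an https:// URL');
  }
  if (![0, 1, 2].includes(options.agentType)) {
    errors.push('agentType must be 0, 1, or 2');
  }
  // userToken is optional, but must be non-empty when provided.
  if (options.userToken !== undefined &&
      (typeof options.userToken !== 'string' || options.userToken.length === 0)) {
    errors.push('userToken, if provided, must be a non-empty string');
  }
  return errors;
}
```

An empty array means the options passed every check; otherwise each entry names one problem, and AGENT_TYPES[options.agentType] gives a readable label for the chosen call type.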
Mobile Integration
The demo supports QR‑code scanning via WeChat/DingTalk or opening the link directly in a mobile browser.
After opening the page, click "Show Details" and then "Visit Site" to start the AI real‑time voice conversation on your mobile device.
Personalize Agent Persona
Modify existing workflow templates and preset different scenarios or agent personas to achieve customized experiences.
Connect Private Knowledge Base
Enhance the agent's expertise in specific domains by creating a knowledge base and a RAG (retrieval‑augmented generation) application on the Bailian platform, then configure the integration.
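To show conceptually what a RAG application does, the toy sketch below picks the most relevant passage for a question and places it in the prompt sent to the model. It is only an illustration: the passages are made up, the names (knowledgeBase, retrieve, buildPrompt) are hypothetical, and simple word overlap stands in for the vector search a managed knowledge base would perform.

```javascript
// Made-up passages for illustration; a real knowledge base is managed on the platform.
const knowledgeBase = [
  'Refunds are processed within 7 business days.',
  'Support is available from 9:00 to 18:00 on weekdays.',
];

// Split text into lowercase alphanumeric words.
const tokenize = (text) => text.toLowerCase().match(/[a-z0-9]+/g) || [];

// Rank passages by how many distinct query words they contain
// (a stand-in for vector similarity) and keep the topK best matches.
function retrieve(query, passages, topK = 1) {
  const queryWords = new Set(tokenize(query));
  return passages
    .map((passage) => ({
      passage,
      score: new Set(tokenize(passage).filter((word) => queryWords.has(word))).size,
    }))
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((hit) => hit.passage);
}

// Prepend the retrieved context to the user's question.
function buildPrompt(query, passages) {
  const context = retrieve(query, passages).join('\n');
  return `Context:\n${context}\n\nQuestion: ${query}`;
}
```

For the question 'When are refunds processed?', only the refund passage shares words with the query, so it alone is placed in the prompt's context section.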
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
