Build a Voice‑Enabled Chatbot in Python with Baidu AI and Qingyunke
Learn how to create a Python program that captures spoken input, converts it to text using Baidu's speech‑recognition API, sends the text to the free Qingyunke chatbot for intelligent replies, and then synthesizes the response back into speech, with complete code snippets and setup instructions.
Brief Overview
Over the past two days I needed to build a small Python program that holds a human‑machine conversation through an intelligent dialogue API. The goal is voice‑based interaction: the user speaks, the system recognizes the speech and turns it into a text query, sends that query to a chatbot, and finally plays the chatbot's answer as speech.
Overall Idea
The computer receives the user's voice input.
The voice input is converted into text.
The text is sent to an intelligent dialogue API, which returns an answer as text.
The answer text is converted back into speech for playback.
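The four steps above can be sketched as one function that chains the stages together. This is only an illustration of the data flow: `run_turn` and its parameter names are hypothetical, and the stand-in lambdas replace the real microphone, Baidu, Qingyunke and text-to-speech calls so the sketch runs anywhere.

```python
from typing import Callable


def run_turn(
    record: Callable[[], bytes],
    recognize: Callable[[bytes], str],
    chat: Callable[[str], str],
    speak: Callable[[str], None],
) -> str:
    """Run one conversation turn: record -> recognize -> chat -> speak."""
    audio = record()          # step 1: capture the user's voice
    text = recognize(audio)   # step 2: convert speech to text
    reply = chat(text)        # step 3: ask the dialogue API for an answer
    speak(reply)              # step 4: convert the answer to speech
    return reply


# Stand-in stages (assumptions for illustration, not the real services).
spoken = []
reply = run_turn(
    record=lambda: b"\x00\x00" * 16000,
    recognize=lambda audio: "hello",
    chat=lambda text: "echo: " + text,
    speak=spoken.append,
)
print(reply)
```

Keeping each stage behind a plain callable makes it easy to swap in the real recorder, recognizer, chatbot and speech engine one at a time.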
Required Environment
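pip install pyaudio (records audio from the microphone; the recording is saved as a wav file)
pip install baidu-aip (Baidu AI SDK, used for speech‑to‑text)
pip install pyttsx3 (converts text to speech)
pip install requests (HTTP client, used later to call the chatbot API)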
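Before going further, it can help to confirm that the packages are importable. The sketch below is a convenience check, not part of the original program; `missing_packages` and `REQUIRED` are hypothetical names. It also lists `requests`, which the chatbot code later in this article imports.

```python
from importlib.util import find_spec

# Maps pip package name -> import name. Note that baidu-aip is imported as "aip".
REQUIRED = {
    "pyaudio": "pyaudio",
    "baidu-aip": "aip",
    "pyttsx3": "pyttsx3",
    "requests": "requests",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


# Prints the packages that still need "pip install"; empty when all are present.
print(missing_packages())
```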
Receive User Voice Input and Save as Audio File
```python
import time
import wave
from pyaudio import PyAudio, paInt16

framerate = 16000   # sample rate in Hz
num_samples = 2000  # frames per buffer
channels = 1        # mono
sampwidth = 2       # sample width: 2 bytes (16-bit)
FILEPATH = '../voices/myvoices.wav'  # ensure the directory exists

class Speak():
    # Save audio data to a wav file
    def save_wave_file(self, filepath, data):
        wf = wave.open(filepath, 'wb')
        wf.setnchannels(channels)
        wf.setsampwidth(sampwidth)
        wf.setframerate(framerate)
        wf.writeframes(b''.join(data))
        wf.close()

    # Record voice
    def my_record(self):
        pa = PyAudio()
        stream = pa.open(format=paInt16, channels=channels, rate=framerate, input=True, frames_per_buffer=num_samples)
        my_buf = []
        t = time.time()
        print('正在讲话...')  # "Speaking..."
        while time.time() < t + 5:  # record for 5 seconds
            string_audio_data = stream.read(num_samples)
            my_buf.append(string_audio_data)
        print('讲话结束')  # "Recording finished"
        self.save_wave_file(FILEPATH, my_buf)
        stream.close()
        pa.terminate()  # release the audio device
```
Call Baidu AI Interface for Speech Recognition
First, create an account on the Baidu AI Open Platform, enable the Speech Recognition service, and obtain an AppID, API Key, and Secret Key. These credentials are used to call the API.
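Since these credentials are secrets, one option is to read them from environment variables rather than writing them into the script. The sketch below is a suggestion, not the article's original approach: `load_baidu_credentials` and the three variable names are hypothetical, and the demo dictionary stands in for the real environment.

```python
import os


def load_baidu_credentials(env=os.environ):
    """Read the three Baidu credentials from environment variables.

    The variable names are hypothetical; any names work as long as the
    same ones are set in the shell.
    """
    names = ("BAIDU_APP_ID", "BAIDU_API_KEY", "BAIDU_SECRET_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)


# Demo with a stand-in mapping; call load_baidu_credentials() with no
# argument to read the real environment instead.
demo_env = {
    "BAIDU_APP_ID": "your-app-id",
    "BAIDU_API_KEY": "your-api-key",
    "BAIDU_SECRET_KEY": "your-secret-key",
}
app_id, api_key, secret_key = load_baidu_credentials(demo_env)
print(app_id)
```

The three returned values can then be passed to `AipSpeech` in place of hard-coded strings.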
After the credentials are ready, the following code sends the recorded wav file to Baidu for recognition.
```python
from aip import AipSpeech

APP_ID = '25990397'
API_KEY = 'iS91n0uEOujkMIlsOTLxiVOc'
SECRET_KEY = ''  # fill in your secret key
client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)

class ReadWav():
    # Read the recorded file as raw bytes
    def get_file_content(self, filePath):
        with open(filePath, 'rb') as fp:
            return fp.read()

    # Send the 16 kHz wav audio to Baidu; dev_pid 1537 selects the Mandarin model
    def predict(self):
        return client.asr(self.get_file_content('../voices/myvoices.wav'), 'wav', 16000, {'dev_pid': 1537})

readWav = ReadWav()
print(readWav.predict())
```
Sample output:
```
{'corpus_no': '7087884083428433929', 'err_msg': 'success.', 'err_no': 0, 'result': ['你叫什么名字呀?'], 'sn': '255158586831650276613'}
```
The recognized text in result means "What is your name?".
Request Intelligent Robot and Get Reply
A free, no‑registration chatbot service called Qingyunke can be called with a simple GET request.
```python
import urllib.parse
import requests

def talkWithRobot(msg):
    # URL-encode the message and pass it as the msg query parameter
    url = 'http://api.qingyunke.com/api.php?key=free&appid=0&msg={}'.format(urllib.parse.quote(msg))
    html = requests.get(url)
    return html.json()["content"]

print(talkWithRobot("你好呀!"))  # "Hello!"
```
Typical response: 哟~ 都好都好 (roughly, "Hey, all good, all good")
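The function above returns `content` directly, which raises an error if the service answers with something unexpected. A slightly more defensive parser might look like the sketch below. Only the `content` field is confirmed by the code in this article; the `result` status field and the `{br}` line-break marker are assumptions about the response format, and `extract_reply` is a hypothetical helper name.

```python
def extract_reply(payload):
    """Pull the reply text out of a decoded Qingyunke JSON response."""
    # Assumption: a "result" value of 0 signals success.
    if payload.get("result") != 0 or "content" not in payload:
        raise ValueError("unexpected chatbot response: {!r}".format(payload))
    # Assumption: "{br}" marks a line break inside the reply text.
    return payload["content"].replace("{br}", "\n")


print(extract_reply({"result": 0, "content": "哟~ 都好都好"}))
```

In `talkWithRobot`, this would replace the direct `["content"]` lookup, for example `return extract_reply(html.json())`. Passing `timeout=5` to `requests.get` also keeps the program from hanging if the service is slow.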
Convert Answer to Speech and Play
```python
import pyttsx3

class RobotSay():
    def __init__(self):
        self.engine = pyttsx3.init()
        # Lower the default speaking rate by 50 words per minute
        self.rate = self.engine.getProperty('rate')
        self.engine.setProperty('rate', self.rate - 50)

    def say(self, msg):
        self.engine.say(msg)
        self.engine.runAndWait()

robotSay = RobotSay()
robotSay.say("你好呀")  # plays the text as speech
```
Combine Everything into an Automatic Chatbot
```python
import urllib.parse
import requests

# Speak, ReadWav and RobotSay are the classes defined in the sections above

def talkWithRobot(msg):
    url = 'http://api.qingyunke.com/api.php?key=free&appid=0&msg={}'.format(urllib.parse.quote(msg))
    html = requests.get(url)
    return html.json()["content"]

robotSay = RobotSay()
speak = Speak()
readTalk = ReadWav()

while True:
    speak.my_record()  # record voice
    text = readTalk.predict()['result'][0]  # speech-to-text
    print("本人说:", text)  # "I said:"
    response_dialogue = talkWithRobot(text)  # chatbot reply
    print("青云客说:", response_dialogue)  # "Qingyunke said:"
    robotSay.say(response_dialogue)  # play reply
```
Sample interaction:
```
正在讲话...
讲话结束...
本人说: 你好呀。
青云客说: 哟~ 都好都好
...
```
Future Work
The current implementation is a simple command‑line prototype. Next steps include building a GUI, adding more features, and sharing the complete project with the community.
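One small hardening step along these lines: the main loop indexes `['result'][0]` directly, which fails whenever recognition does not succeed (for example on silence). The sketch below checks the `err_no` and `result` fields shown in the sample output first. `recognized_text` is a hypothetical helper, and treating any non-zero `err_no` as a failure is an assumption.

```python
def recognized_text(asr_result):
    """Return the first transcript from a Baidu ASR result dict, or None."""
    # Assumption: err_no == 0 means success, as in the sample output.
    if asr_result.get("err_no") != 0:
        return None
    results = asr_result.get("result") or []
    return results[0] if results else None


sample = {
    "corpus_no": "7087884083428433929",
    "err_msg": "success.",
    "err_no": 0,
    "result": ["你叫什么名字呀?"],
    "sn": "255158586831650276613",
}
print(recognized_text(sample))
```

In the loop, a `None` return could simply skip the turn with `continue` instead of crashing the program.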
