
How to Record and Play Audio in Python with PyAudio and Pynput

This article demonstrates how to use Python's PyAudio library and the wave module to record and play back audio on Windows 10. It covers installation, stream handling with callbacks, device selection, the attributes a GUI needs, and keyboard hotkey control via pynput, and ends with the full source code.

Python Crawling & Data Mining

Hello, I'm the author. In this article I share how to build an audio recording tool in Python using PyAudio, following this outline:

Application platform

Audio recording part

Audio playback part

GUI window required attribute code

Pynput keyboard listener

Summary

Motivated by the idea of building a screen‑recording tool, I decided to start a series on using Python for audio recording and playback.

Application Platform

Windows 10

Python 3.7

Audio Recording Part

Audio is recorded much like video: it is captured as a stream of data frames. Install the required package with:

pip install PyAudio

If installation fails, download the .whl file that matches Python 3.7 and 64‑bit Windows, then install it with:

pip install PyAudio-xx.whl

Below is the main recording code (streaming with a callback):

import time
from pyaudio import PyAudio, paInt16, paContinue, paComplete

# Fixed parameters
chunk = 1024  # frames per buffer
format_sample = paInt16  # sample format
channels = 2  # 1 = mono, 2 = stereo
fps = 44100  # sampling rate

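stop_flag = False  # placeholder: set to True elsewhere (e.g. by a hotkey) to end recording

def callback(in_data, frame_count, time_info, status):
    """Recording callback: write each captured buffer to the wave file."""
    wf.writeframes(in_data)
    if not stop_flag:
        return in_data, paContinue  # keep recording
    else:
        return in_data, paComplete  # finish the stream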

p = PyAudio()
stream = p.open(format=format_sample,
    channels=channels,
    rate=fps,
    frames_per_buffer=chunk,
    input=True,
    input_device_index=None,  # default device
    stream_callback=callback)
stream.start_stream()
while stream.is_active():
    time.sleep(0.1)
stream.stop_stream()
stream.close()
p.terminate()
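The fixed parameters above determine how much data each callback receives. The following is a small arithmetic sketch, not part of the original tool; it assumes 16-bit samples (2 bytes each), which is what paInt16 implies.

```python
# Values copied from the "Fixed parameters" in the recording code.
chunk = 1024          # frames per buffer
channels = 2          # stereo
fps = 44100           # sampling rate in Hz
sample_width = 2      # bytes per sample for 16-bit audio (assumed from paInt16)

bytes_per_frame = channels * sample_width     # one sample per channel
bytes_per_buffer = chunk * bytes_per_frame    # size of in_data in each callback
buffer_seconds = chunk / fps                  # audio time covered by one buffer
bytes_per_second = fps * bytes_per_frame      # raw data rate written to the file

print(bytes_per_frame)                    # 4
print(bytes_per_buffer)                   # 4096
print(round(buffer_seconds * 1000, 1))    # 23.2
print(bytes_per_second)                   # 176400
```

With these settings the callback fires roughly every 23 ms, so a stop flag is noticed within about one buffer.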

The callback writes to a wave file, wf. Create it after p = PyAudio() and before opening the stream, because setsampwidth() needs p:

import wave
wf = wave.open('test.wav', 'wb')
wf.setnchannels(channels)
wf.setsampwidth(p.get_sample_size(format_sample))
wf.setframerate(fps)

To make the code reusable, it can be wrapped into a class (partial example shown):

from pyaudio import PyAudio, paInt16

class AudioRecord(PyAudio):
    def __init__(self):
        super().__init__()
        self.chunk = 1024
        self.format_sample = paInt16
        self.channels = 2
        self.fps = 44100
        # additional attributes ...

Audio Playback Part

The playback code mirrors the recording code. Core snippet:

wf = wave.open('test.wav', 'rb')

def callback(in_data, frame_count, time_info, status):
    data = wf.readframes(frame_count)
    return data, paContinue

stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
    channels=wf.getnchannels(),
    rate=wf.getframerate(),
    output=True,
    output_device_index=output_device_index,
    stream_callback=callback)
stream.start_stream()
while stream.is_active():
    time.sleep(0.1)
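The loop above ends once the file runs out of frames. As far as I understand the PyAudio documentation, an output stream treats a buffer shorter than frame_count as completion, which is why the callback can always return paContinue. The sketch below simulates that behaviour without an audio device and makes the completion explicit; PA_CONTINUE and PA_COMPLETE are local stand-ins for the PyAudio constants, and the in-memory WAV is invented test data.

```python
import io
import wave

PA_CONTINUE, PA_COMPLETE = 0, 1  # local stand-ins for paContinue / paComplete

# Build a short silent mono WAV in memory: 2500 frames of 16-bit samples.
buf = io.BytesIO()
with wave.open(buf, 'wb') as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b'\x00\x00' * 2500)
buf.seek(0)
wf = wave.open(buf, 'rb')

def callback(in_data, frame_count, time_info, status):
    """Playback callback that reports completion when the file is exhausted."""
    data = wf.readframes(frame_count)
    frames_read = len(data) // (wf.getsampwidth() * wf.getnchannels())
    flag = PA_CONTINUE if frames_read == frame_count else PA_COMPLETE
    return data, flag

# Drive the callback by hand, the way the audio backend would.
calls = []
while True:
    data, flag = callback(None, 1024, None, None)
    calls.append((len(data) // 2, flag))
    if flag == PA_COMPLETE:
        break
wf.close()

print(calls)  # [(1024, 0), (1024, 0), (452, 1)]
```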

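Playback was tested successfully with files named .wav and .mp3. Note that the wave module only reads and writes uncompressed WAV data, so a recording saved by this code under an .mp3 name still contains WAV data, and a genuinely MP3‑encoded file cannot be opened with wave.open.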
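Because the file extension alone does not say what a file contains, it can help to check the header before handing a file to the playback code. This is a minimal sketch under the assumption that a WAV file starts with the usual RIFF/WAVE signature; looks_like_wav is a hypothetical helper, and the "MP3" bytes are a made-up stand-in, not real audio.

```python
import io
import wave

def looks_like_wav(header: bytes) -> bool:
    """Return True if the first 12 bytes carry the RIFF/WAVE signature."""
    return len(header) >= 12 and header[:4] == b'RIFF' and header[8:12] == b'WAVE'

# A real (tiny) WAV file built in memory.
wav_buf = io.BytesIO()
with wave.open(wav_buf, 'wb') as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b'\x00\x00' * 100)
wav_bytes = wav_buf.getvalue()

# Hypothetical stand-in for MP3 data: an ID3 tag marker followed by padding.
fake_mp3_bytes = b'ID3' + b'\x00' * 64

print(looks_like_wav(wav_bytes[:12]))       # True
print(looks_like_wav(fake_mp3_bytes[:12]))  # False

# The wave module rejects data that is not in WAV format.
try:
    wave.open(io.BytesIO(fake_mp3_bytes), 'rb')
    opened = True
except wave.Error:
    opened = False
print(opened)  # False
```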

GUI Window Required Attribute Code Part

For a user‑friendly GUI, the following snippets retrieve audio duration and list input/output devices:

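# Audio duration in seconds
duration = wf.getnframes() / wf.getframerate()
# Get system devices
for i in range(self.get_device_count()):
    dev_info = self.get_device_info_by_index(i)
    default_rate = int(dev_info['defaultSampleRate'])
    # Keep devices on host API 0 whose default rate equals fps. '映射器' ("mapper")
    # appears in the Sound Mapper device names on Chinese-language Windows.
    if not dev_info['hostApi'] and default_rate == fps and '映射器' not in dev_info['name']:
        if dev_info['maxInputChannels']:
            print('Input device:', dev_info['name'])
        elif dev_info['maxOutputChannels']:
            print('Output device:', dev_info['name'])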
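The filtering rule can be exercised without audio hardware by feeding it dictionaries shaped like the ones get_device_info_by_index returns. In this sketch split_devices is a hypothetical helper and every device entry is invented sample data; only the key names are taken from the snippet above.

```python
def split_devices(devices, rate=44100, skip_word='映射器'):
    """Split device-info dicts into {name: index} maps for inputs and outputs."""
    inputs, outputs = {}, {}
    for index, info in enumerate(devices):
        if info['hostApi'] != 0:                    # same as: not info['hostApi']
            continue
        if int(info['defaultSampleRate']) != rate:  # keep only matching rates
            continue
        if skip_word in info['name']:               # drop the mapper entries
            continue
        if info['maxInputChannels']:
            inputs[info['name']] = index
        elif info['maxOutputChannels']:
            outputs[info['name']] = index
    return inputs, outputs

# Invented sample data for illustration only.
sample_devices = [
    {'name': 'Microsoft 声音映射器 - Input', 'hostApi': 0,
     'defaultSampleRate': 44100.0, 'maxInputChannels': 2, 'maxOutputChannels': 0},
    {'name': 'Microphone (Demo Device)', 'hostApi': 0,
     'defaultSampleRate': 44100.0, 'maxInputChannels': 2, 'maxOutputChannels': 0},
    {'name': 'Speakers (Demo Device)', 'hostApi': 0,
     'defaultSampleRate': 44100.0, 'maxInputChannels': 0, 'maxOutputChannels': 2},
    {'name': 'Speakers (Other Host API)', 'hostApi': 1,
     'defaultSampleRate': 48000.0, 'maxInputChannels': 0, 'maxOutputChannels': 2},
]

inputs, outputs = split_devices(sample_devices)
print(inputs)   # {'Microphone (Demo Device)': 1}
print(outputs)  # {'Speakers (Demo Device)': 2}
```

The two dictionaries map a display name to a device index, which is the shape a GUI drop-down needs.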

Pynput Keyboard Listener

The pynput library is used to listen for hotkeys that stop or cancel recording:

def hotkey(self):
    """Hotkey listener"""
    with keyboard.Listener(on_press=self.on_press) as listener:
        listener.join()

def on_press(self, key):
    try:
        if key.char == 't':  # stop and save
            self.flag = True
        elif key.char == 'k':  # cancel and delete
            self.flag = True
            self.kill = True
    except AttributeError:
        pass  # special keys such as Shift have no .char attribute
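One possible variation on the handler, shown here as a sketch rather than the author's approach: read the character with getattr so special keys need no exception handling, and use threading.Event objects in place of plain boolean flags. The key presses are simulated with SimpleNamespace objects so the example runs without pynput installed; with a real keyboard.Listener, returning False from the handler stops the listener.

```python
import threading
from types import SimpleNamespace

stop_event = threading.Event()    # plays the role of self.flag
cancel_event = threading.Event()  # plays the role of self.kill

def on_press(key):
    """Handle one key press; return False to ask the listener to stop."""
    char = getattr(key, 'char', None)  # special keys carry no .char
    if char == 't':        # stop and save
        stop_event.set()
    elif char == 'k':      # cancel and delete
        cancel_event.set()
        stop_event.set()
    return False if stop_event.is_set() else None

# Simulated key presses (stand-ins for the objects pynput would pass in).
first = on_press(SimpleNamespace(name='shift'))  # special key: ignored
second = on_press(SimpleNamespace(char='a'))     # unrelated character: ignored
third = on_press(SimpleNamespace(char='k'))      # cancel hotkey

print(first, second, third)                         # None None False
print(stop_event.is_set(), cancel_event.is_set())   # True True
```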

Summary

This article covered using PyAudio on Windows to record and play audio, demonstrated class‑based design, callback streaming, device handling, GUI integration, and keyboard hotkey control with pynput. The provided source code illustrates the core concepts and can be extended further.

import wave
import time
from pathlib import Path
from threading import Thread
from pyaudio import PyAudio, paInt16, paContinue, paComplete
from pynput import keyboard  # pip install pynput

class AudioRecord(PyAudio):
    def __init__(self, channels=2):
        super().__init__()
        self.chunk = 1024
        self.format_sample = paInt16
        self.channels = channels
        self.fps = 44100
        self.input_dict = None
        self.output_dict = None
        self.stream = None
        self.filename = '~test.wav'
        self.duration = 0
        self.flag = False
        self.kill = False

    def __call__(self, filename):
        """Override filename"""
        self.filename = filename

    def callback_input(self, in_data, frame_count, time_info, status):
        """Recording callback"""
        self.wf.writeframes(in_data)
        if not self.flag:
            return in_data, paContinue
        else:
            return in_data, paComplete

    def callback_output(self, in_data, frame_count, time_info, status):
        """Playback callback"""
        data = self.wf.readframes(frame_count)
        return data, paContinue

    def open_stream(self, name):
        """Open recording stream"""
        input_device_index = self.get_device_index(name, True) if name else None
        return self.open(format=self.format_sample,
                         channels=self.channels,
                         rate=self.fps,
                         frames_per_buffer=self.chunk,
                         input=True,
                         input_device_index=input_device_index,
                         stream_callback=self.callback_input)

    def audio_record_run(self, name=None):
        """Audio recording"""
        self.wf = self.save_audio_file(self.filename)
        self.stream = self.open_stream(name)
        self.stream.start_stream()
        while self.stream.is_active():
            time.sleep(0.1)
        self.wf.close()
        if self.kill:
            Path(self.filename).unlink()
        self.duration = self.get_duration(self.wf)
        print(self.duration)
        self.terminate_run()

    def run(self, filename=None, name=None, record=True):
        """Audio recording thread"""
        thread_1 = Thread(target=self.hotkey, daemon=True)
        if record:
            if filename:
                self.filename = filename
            thread_2 = Thread(target=self.audio_record_run, args=(name,))
        else:
            if not filename:
                raise Exception('No audio filename provided for playback')
            thread_2 = Thread(target=self.read_audio, args=(filename, name,))
        thread_1.start()
        thread_2.start()

    def read_audio(self, filename, name=None):
        """Audio playback"""
        output_device_index = self.get_device_index(name, False) if name else None
        with wave.open(filename, 'rb') as self.wf:
            self.duration = self.get_duration(self.wf)
            self.stream = self.open(format=self.get_format_from_width(self.wf.getsampwidth()),
                                   channels=self.wf.getnchannels(),
                                   rate=self.wf.getframerate(),
                                   output=True,
                                   output_device_index=output_device_index,
                                   stream_callback=self.callback_output)
            self.stream.start_stream()
            while self.stream.is_active():
                time.sleep(0.1)
        print(self.duration)
        self.terminate_run()

    @staticmethod
    def get_duration(wf):
        """Get audio duration"""
        return round(wf.getnframes() / wf.getframerate(), 2)

    def get_in_out_devices(self):
        """Get system input/output devices"""
        self.input_dict = {}
        self.output_dict = {}
        for i in range(self.get_device_count()):
            dev_info = self.get_device_info_by_index(i)
            default_rate = int(dev_info['defaultSampleRate'])
            if not dev_info['hostApi'] and default_rate == self.fps and '映射器' not in dev_info['name']:
                if dev_info['maxInputChannels']:
                    self.input_dict[dev_info['name']] = i
                elif dev_info['maxOutputChannels']:
                    self.output_dict[dev_info['name']] = i

    def get_device_index(self, name, input_in=True):
        """Get selected device index"""
        if input_in and self.input_dict:
            return self.input_dict.get(name, -1)
        elif not input_in and self.output_dict:
            return self.output_dict.get(name, -1)

    def save_audio_file(self, filename):
        """Save audio file"""
        wf = wave.open(filename, 'wb')
        wf.setnchannels(self.channels)
        wf.setsampwidth(self.get_sample_size(self.format_sample))
        wf.setframerate(self.fps)
        return wf

    def terminate_run(self):
        """Stop and close streams"""
        if self.stream:
            self.stream.stop_stream()
            self.stream.close()
        self.terminate()

    def hotkey(self):
        """Hotkey listener"""
        with keyboard.Listener(on_press=self.on_press) as listener:
            listener.join()

    def on_press(self, key):
        try:
            if key.char == 't':
                self.flag = True
            elif key.char == 'k':
                self.flag = True
                self.kill = True
        except AttributeError:
            pass  # special keys such as Shift have no .char attribute

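if __name__ == '__main__':
    # run() returns immediately (useful from a GUI), so this demo calls the
    # blocking methods directly to make sure recording ends before playback.
    recorder = AudioRecord()
    recorder.get_in_out_devices()
    print(recorder.input_dict)
    recorder('test.wav')  # the wave module writes WAV data, so use a .wav name
    Thread(target=recorder.hotkey, daemon=True).start()
    recorder.audio_record_run()  # press 't' to stop and save, 'k' to cancel

    if not recorder.kill:
        # terminate_run() shuts PyAudio down, so playback uses a new instance.
        player = AudioRecord()
        player.get_in_out_devices()
        print(player.output_dict)
        player.read_audio('test.wav')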