Backend Development 9 min read

Using Python's subprocess Module in a Data Construction Platform: Basics, Issues, and Solutions

This article explains how the Python subprocess module is employed in a data construction platform, compares run and Popen methods, discusses challenges such as incomplete log capture and process termination, and presents three practical solutions to ensure reliable execution and logging.

360 Quality & Efficiency

In our daily testing we often need to generate large amounts of data, so we built a data construction platform that separates the front‑end from the script execution. The scripts run on a Windows Server (Python 3.8.3) and are launched as child processes using the subprocess module.

The subprocess module starts new processes and connects to their input, output, and error streams. The recommended entry point is subprocess.run(), which is synchronous: it blocks until the process finishes and then returns a CompletedProcess object. For more flexible scenarios we use subprocess.Popen(), which returns as soon as the process has been created and lets us read its output while it is still running.
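To make the difference concrete, here is a minimal sketch that runs the same child command both ways. The child command is a hypothetical stand-in (a one-line Python script), not one of the platform's real data construction scripts.

```python
import subprocess
import sys

# Hypothetical child command: a tiny Python script that prints two lines.
CHILD = [sys.executable, "-c", "print('step 1'); print('step 2')"]

# run(): blocks until the child exits, then returns a CompletedProcess.
completed = subprocess.run(CHILD, capture_output=True, text=True)
run_lines = completed.stdout.splitlines()

# Popen(): returns immediately; lines can be consumed while the child runs.
popen_lines = []
with subprocess.Popen(CHILD, stdout=subprocess.PIPE, text=True) as proc:
    for line in proc.stdout:
        popen_lines.append(line.rstrip("\n"))
# Leaving the with-block closes the pipe and waits for the child.
returncode = proc.returncode
```

Both variants end up with the same lines. The difference is that the Popen loop sees each line as soon as the child writes it, while run() hands over the whole output only after the child has exited.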

Below is a brief overview of common Popen parameters and methods:

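Common subprocess.Popen() parameters:
args: the command to run, given either as a string or as a sequence (such as a list or tuple).
stdin, stdout, stderr: the child's standard input, standard output, and standard error handles.
shell: if True, the command is executed through the operating system's shell, and args should normally be a single string; if False, args can be a sequence.

Common Popen object methods:
poll(): checks whether the process has terminated. Returns returncode if it has, otherwise None. Our project uses this return value to decide whether a process has finished.
wait(timeout): waits for the child process to terminate. For long-running processes, this ensures the process runs to completion.
communicate(input, timeout): interacts with the child process by sending data to it and reading data from it.
send_signal(signal): sends a signal to the child process.
terminate(): stops the child process. On POSIX this sends SIGTERM to the child.
kill(): kills the child process. On POSIX this sends SIGKILL to the child; on Windows kill() is an alias for terminate().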
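The methods listed above are often combined into a "stop politely, then force" helper. The sketch below is one way to do that and is not taken from the platform's code: stop_process, grace_seconds, and the sleeping child are hypothetical names chosen for illustration.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=5.0):
    """Ask the child to exit, then force it if it ignores the request."""
    if proc.poll() is not None:
        # The child has already finished; nothing to stop.
        return proc.returncode
    proc.terminate()  # SIGTERM on POSIX, TerminateProcess() on Windows
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX; same as terminate() on Windows
        return proc.wait()


# Hypothetical long-running child, started without a shell.
sleeper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
still_running = sleeper.poll() is None  # None means "not finished yet"
exit_code = stop_process(sleeper)
```

The exact exit code depends on the platform, so callers should only rely on it being set once stop_process returns.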

We compared the synchronous run() with the asynchronous Popen(). Popen can capture logs while the child process is still running, whereas run() only returns logs after the process finishes. Because our data construction tasks are long‑running and require real‑time logs, we chose Popen.

Problems encountered and solutions

1. Ensuring complete log capture – The initial approach used poll() to detect process termination and stopped reading as soon as it returned a value. This sometimes missed the final log output, because a process can exit while unread output is still sitting in the pipe. A second scheme instead checked whether the log stream had been fully read, and a third scheme combined the two: reading stops only when poll() reports that the process has exited and the stream has no more data. The final implementation also filters out empty lines to keep the log tidy.
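The article does not show the final code, so the following is only one possible reading of the third scheme. The function name stream_logs, the merged stderr, the short sleep, and the demo child are all assumptions made for this sketch.

```python
import subprocess
import sys
import time


def stream_logs(command):
    """Yield non-empty output lines until the child has exited AND the pipe is drained."""
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so error output is not lost
        text=True,
    )
    while True:
        line = proc.stdout.readline()  # returns "" only at end of stream
        if line:
            line = line.strip()
            if line:  # drop blank lines to keep the log tidy
                yield line
        elif proc.poll() is not None:
            break  # process finished and nothing is left to read
        else:
            time.sleep(0.05)  # stream closed but process still running
    proc.stdout.close()
    proc.wait()


# Hypothetical child: prints a line, a blank line, then a final line.
# "-u" turns off the child interpreter's output buffering.
demo = [sys.executable, "-u", "-c", "print('start'); print(); print('done')"]
logs = list(stream_logs(demo))
```

Checking poll() only after readline() has returned an empty string is what keeps the last lines from being dropped: the loop cannot exit while data is still readable.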

2. Guaranteeing proper termination of script processes – Calling Popen.terminate() only killed the shell process when shell=True, leaving the actual command running. The official description of terminate() is:

Stop the child. On POSIX OSs the method sends SIGTERM to the child. On Windows the Win32 API function TerminateProcess() is called to stop the child.

Because shell=True creates an intermediate shell, the termination signal does not reach the real child process. We solved this in three ways:

Set shell=False when creating the Popen object: subprocess.Popen(command, shell=False).

When shell=False is not possible (our command must be a string), use the psutil library to enumerate and kill all descendant processes.

On Windows, directly invoke taskkill /t /f /pid {pid} to force‑kill the process tree without extra dependencies.
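The second and third options can be sketched as follows. This is an illustration under stated assumptions, not the platform's actual code: the function names are hypothetical, psutil is a third-party package that must be installed separately, and the taskkill variant only works on Windows.

```python
import subprocess


def taskkill_command(pid):
    """Build the Windows command that force-kills a process and its children."""
    return ["taskkill", "/t", "/f", "/pid", str(pid)]


def kill_tree_with_taskkill(pid):
    """Windows only: /t kills the whole process tree, /f forces termination."""
    return subprocess.run(taskkill_command(pid), capture_output=True, text=True)


def kill_tree_with_psutil(pid):
    """Kill a process and all of its descendants using psutil."""
    import psutil  # third-party; assumed to be installed (pip install psutil)

    try:
        parent = psutil.Process(pid)
    except psutil.NoSuchProcess:
        return  # the process is already gone
    procs = parent.children(recursive=True) + [parent]
    for p in procs:
        try:
            p.kill()
        except psutil.NoSuchProcess:
            pass  # exited between enumeration and kill
    # Give the killed processes a few seconds to disappear.
    psutil.wait_procs(procs, timeout=5)
```

Either helper would be called with the pid attribute of the Popen object, for example kill_tree_with_psutil(proc.pid), followed by proc.wait() so the shell process itself is reaped.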

These solutions ensure that both the process and any spawned children are terminated correctly, and that logs are captured completely and cleanly.

Overall, the article shares practical insights and mitigation strategies for using Python's subprocess module in a production environment.
