Why Is Python So Slow? Boost Speed 1000× with NumPy UFuncs
This article examines Python's notorious performance lag, explains why its dynamic typing and object overhead make simple loops sluggish, and demonstrates how NumPy's universal functions can accelerate reciprocal calculations by over a thousand times, outperforming even compiled languages.
1. How slow is Python really?
Python often ranks near the bottom of language speed comparisons. The usual explanation is that it is interpreted, but Java also runs its bytecode on a virtual machine and is much faster. In a benchmark that uses a traditional for loop to compute the reciprocal of one million numbers, Python takes about 3.37 seconds, while C finishes in 9 ms, C# in 19 ms, Node.js in 26 ms, and Java in 5 ms.
import numpy as np
np.random.seed(0)
values = np.random.randint(1, 100, size=1000000)

def get_reciprocal(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0/values[i]
    return output

%timeit get_reciprocal(values)

The result: each loop averages 3.37 seconds (±582 ms) over seven runs.
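The `%timeit` line is an IPython/Jupyter magic and will not run in a plain Python script. A minimal sketch of a comparable measurement with the standard-library `timeit` module follows. The name `small_values`, the reduced array size (100,000), and the repeat count are assumptions chosen to keep the run short; they are not the article's settings, so the absolute numbers will differ.

```python
import timeit
import numpy as np

def get_reciprocal(values):
    """Element-by-element reciprocal using an explicit Python loop."""
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

np.random.seed(0)
# Hypothetical smaller input (100,000 instead of 1,000,000) to keep the run short.
small_values = np.random.randint(1, 100, size=100_000)

# Three independent runs of one call each; the minimum is the least noisy estimate.
timings = timeit.repeat(lambda: get_reciprocal(small_values), repeat=3, number=1)
loop_seconds = min(timings)

print(f"loop over {len(small_values):,} values: {loop_seconds * 1e3:.1f} ms")
```

Taking the minimum rather than the mean is a common convention for micro-benchmarks, since slower runs mostly reflect interference from other processes.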
2. The root cause of Python's slowness
Python is a dynamically‑typed language where every variable is an object. Each operation requires unboxing, type checking, and attribute lookup, which adds significant overhead inside loops. In contrast, compiled languages access data directly without such checks.
Even a simple assignment like a = 1 involves two steps: setting the object's type code to Integer and storing the value.
Step 1: Set a->PyObject_HEAD->typecode to Integer.
Step 2: Assign the value 1 to a->val.
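The two steps above are a simplified model of what the interpreter does. One rough way to glimpse the per-object overhead is to compare the size of a full Python integer object with the raw bytes a typed NumPy array stores per element. This sketch assumes CPython and a 64-bit integer dtype; the exact object size varies by interpreter version and platform.

```python
import sys
import numpy as np

a = 1

# Every Python value carries its type with it; the interpreter inspects it at run time.
print(type(a))  # <class 'int'>

# Size of the whole int object (header plus value). CPython-specific.
object_bytes = sys.getsizeof(a)

# Raw storage per element in a typed NumPy array: only the value itself.
element_bytes = np.dtype(np.int64).itemsize

print(f"Python int object: {object_bytes} bytes, int64 array element: {element_bytes} bytes")
```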
3. The answer: NumPy universal functions (UFuncs)
NumPy arrays are built around C arrays of a single fixed type, so accessing elements does not require per-element type checks. Applying a UFunc to an entire array moves the loop into compiled code, which eliminates the Python-level loop overhead.
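To make the term concrete: the `/` operator on an array dispatches to `np.divide`, which is an instance of `np.ufunc`. The sketch below uses a small hand-written array (an assumption for readability) and shows the operator form next to the explicit call. It also covers one pitfall: `np.reciprocal` is not designed for integer arrays and yields 0 for integers larger than 1, so the input is cast to float first.

```python
import numpy as np

# One dtype for the whole array, so no per-element type checks are needed.
values = np.array([1, 2, 4, 5])

print(isinstance(np.divide, np.ufunc))  # True

by_operator = 1.0 / values          # operator form
by_ufunc = np.divide(1.0, values)   # explicit ufunc call, same computation

# Cast to float before np.reciprocal; on an integer array it would use integer arithmetic.
by_reciprocal = np.reciprocal(values.astype(np.float64))

print(by_operator)
```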
import numpy as np
np.random.seed(0)
values = np.random.randint(1, 100, size=1000000)
%timeit result = 1.0/values

This vectorized version runs in about 2.71 ms per loop (±50.8 µs), a speedup of more than a thousand times over the pure Python loop.
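The "more than a thousand times" figure follows directly from the two reported timings. The sketch below works through that arithmetic and also checks, on a small array, that the loop and the vectorized expression produce the same values. The array size of 1,000 and the variable names are assumptions for illustration; the timing constants are the article's reported numbers, not new measurements.

```python
import numpy as np

# Reported timings from the article, in seconds.
loop_seconds = 3.37          # pure Python loop
vectorized_seconds = 2.71e-3 # 1.0 / values

n = 1_000_000
speedup = loop_seconds / vectorized_seconds
print(f"speedup: about {speedup:.0f}x")
print(f"per element: {loop_seconds / n * 1e6:.2f} us (loop) vs "
      f"{vectorized_seconds / n * 1e9:.2f} ns (vectorized)")

# Sanity check that both approaches compute the same result (small hypothetical input).
np.random.seed(0)
sample = np.random.randint(1, 100, size=1_000)
looped = np.empty(len(sample))
for i in range(len(sample)):
    looped[i] = 1.0 / sample[i]
results_match = np.allclose(looped, 1.0 / sample)
print(f"results match: {results_match}")
```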
4. Summary
For Python developers handling numeric data, storing values in NumPy arrays or pandas DataFrames (which are built on NumPy) allows the use of UFuncs for massive speed gains. An operation that took seconds in a Python loop finished, in this benchmark, faster than the equivalent C loop, making Python surprisingly fast when used correctly.
5. Appendix – Test code for C, C#, Java, and Node.js
C:
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

int main(){
    struct timeval stop, start;
    int length = 1000000;
    int rand_array[length];
    float output_array[length];
    for(int i = 0; i<length; i++){
        rand_array[i] = rand();
    }
    gettimeofday(&start, NULL);
    for(int i = 0; i<length; i++){
        output_array[i] = 1.0/(rand_array[i]*1.0);
    }
    gettimeofday(&stop, NULL);
    printf("took %lu us\n", (stop.tv_sec - start.tv_sec) * 1000000 + stop.tv_usec - start.tv_usec);
    return 0;
}

C# (.NET 5.0):
using System;

namespace speed_test{
    class Program{
        static void Main(string[] args){
            int length = 1000000;
            double[] rand_array = new double[length];
            double[] output = new double[length];
            var rand = new Random();
            for(int i =0; i<length;i++){
                rand_array[i] = rand.Next();
            }
            long start = DateTimeOffset.Now.ToUnixTimeMilliseconds();
            for(int i =0; i<length;i++){
                output[i] = 1.0/rand_array[i];
            }
            long end = DateTimeOffset.Now.ToUnixTimeMilliseconds();
            Console.WriteLine(end - start);
        }
    }
}

Java:
import java.util.Random;

public class speed_test {
    public static void main(String[] args){
        int length = 1000000;
        long[] rand_array = new long[length];
        double[] output = new double[length];
        Random rand = new Random();
        for(int i =0; i<length; i++){
            rand_array[i] = rand.nextLong();
        }
        long start = System.currentTimeMillis();
        for(int i =0; i<length; i++){
            output[i] = 1.0/rand_array[i];
        }
        long end = System.currentTimeMillis();
        System.out.println(end - start);
    }
}

Node.js:
let length = 1000000;
let rand_array = [];
let output = [];
for(var i=0;i<length;i++){
    rand_array[i] = Math.floor(Math.random()*10000000);
}
// Date.now() returns a millisecond timestamp; getMilliseconds() only returns
// the 0-999 ms component of the current second, so differences can be wrong.
let start = Date.now();
for(var i=0;i<length;i++){
    output[i] = 1.0/rand_array[i];
}
let end = Date.now();
console.log(end - start);

Original article: https://python.plainenglish.io/a-solution-to-boost-python-speed-1000x-times-c9e7d5be2f40
