How I Boosted My Python Script Speed by 300%: 10 Proven Optimization Tricks
This article walks through ten practical techniques—including profiling with cProfile, using built‑in functions, list comprehensions, avoiding globals, leveraging NumPy, generators, multiprocessing, caching, selective imports, and upgrading Python—to dramatically accelerate Python scripts handling large data sets.
I still remember the first Python script I wrote to process a large dataset. It was a small project, but churning through that much data left me waiting for ages.
A task that should have finished in minutes stretched into hours.
Then I realized something was wrong: my code was not optimized. After many attempts I learned how to fix it, and the result was a script roughly 300% faster.
Let’s go step by step to see how I did it.
1. Profile the code first
First, we need to identify the parts of the code that slow it down, without guessing.
In Python we can use cProfile, which helps us see which parts of the program consume the most time.
Here’s how I use it:
<code>import cProfile
def my_script():
    # Your code
    pass

cProfile.run('my_script()')
</code>It breaks down the time spent in each function.
When I used cProfile in my script, I found that 80% of the runtime was spent in two functions, so I decided to optimize those.
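The raw report is easier to scan if you sort it. A minimal sketch using the standard `pstats` module; the `slow_part` function here is a hypothetical hotspot, not from my actual script:

```python
import cProfile
import io
import pstats

def slow_part():
    # Hypothetical hotspot: a deliberately wasteful computation
    return sum(i * i for i in range(100_000))

def my_script():
    for _ in range(10):
        slow_part()

profiler = cProfile.Profile()
profiler.enable()
my_script()
profiler.disable()

# Sort by cumulative time so the most expensive functions appear first
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Sorting by `"cumulative"` surfaces the functions whose call trees dominate the runtime, which is usually what you want to optimize first.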
2. Use built‑in functions
Python’s built‑in functions are written in C and run much faster than functions we write in pure Python.
In my code I originally used a for loop to sum numbers in a list:
<code>total = 0
for num in my_list:
    total += num
</code>I replaced it with the built‑in sum function:
<code>total = sum(my_list)
</code>This change reduced the runtime of that part by 50%.
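Claims like this are easy to check yourself with the standard `timeit` module. The exact numbers depend on your machine, but `sum` consistently wins; a minimal sketch:

```python
import timeit

my_list = list(range(10_000))

def loop_sum():
    # The original approach: a manual accumulation loop
    total = 0
    for num in my_list:
        total += num
    return total

# Time both versions over the same number of repetitions
loop_time = timeit.timeit(loop_sum, number=200)
builtin_time = timeit.timeit(lambda: sum(my_list), number=200)

print(f"for loop: {loop_time:.4f}s, built-in sum: {builtin_time:.4f}s")
```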
3. Replace loops with list comprehensions
Loops are easy to write but not always the fastest. I had a loop that created a new list from an existing one:
<code>new_list = []
for item in old_list:
    if item > 10:
        new_list.append(item)
</code>I switched to a list comprehension:
<code>new_list = [item for item in old_list if item > 10]
</code>This change doubled the speed.
4. Avoid global variables
Global variables slow scripts because every access is a dictionary lookup, while local variables use a much faster indexed lookup.
I originally used a global counter:
<code>counter = 0
def count_items(items):
    global counter
    for item in items:
        counter += 1
</code>I moved the counter inside the function:
<code>def count_items(items):
    counter = 0
    for item in items:
        counter += 1
    return counter
</code>This change significantly improved speed.
5. Replace regular loops with NumPy
Initially I used a Python list to process numeric data, then switched to NumPy, which is implemented in C.
Before NumPy:
<code>result = [x * 2 for x in my_list]
</code>With NumPy:
<code>import numpy as np
array = np.array(my_list)
result = array * 2
</code>For large lists, the NumPy version runs about ten times faster.
6. Use generators for large data
My script originally stored intermediate results in a list, consuming a lot of memory:
<code>squares = [x**2 for x in range(1_000_000)]
</code>Switching to a generator processes items one by one:
<code>squares = (x**2 for x in range(1_000_000))
</code>This slashes memory usage, and it speeds up the script whenever memory pressure was part of the bottleneck.
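The memory difference is easy to see directly: the generator object itself is tiny, while the list holds a million results at once. A minimal sketch:

```python
import sys

# Materializes one million values up front
list_squares = [x**2 for x in range(1_000_000)]
# Produces values lazily, one at a time
gen_squares = (x**2 for x in range(1_000_000))

# getsizeof reports the container's own footprint:
# the list holds a million references, the generator holds almost nothing
list_size = sys.getsizeof(list_squares)
gen_size = sys.getsizeof(gen_squares)

# Aggregations like sum() consume the stream without storing it
total = sum(gen_squares)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")
```

Note the trade-off: a generator can be consumed only once, so use the list form if you need to iterate over the results repeatedly.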
7. Use multiprocessing for parallel processing
Because of the global interpreter lock, a Python process executes bytecode on only one core at a time. The multiprocessing library sidesteps this by distributing the work across separate processes on all available cores.
Example:
<code>from multiprocessing import Pool
def process_data(item):
    # Your processing logic
    return item * 2

if __name__ == "__main__":
    with Pool() as pool:
        results = pool.map(process_data, my_list)
</code>This gave a 100% speed boost because tasks ran in parallel.
8. Avoid duplicate work
Previously a function repeatedly computed the same result:
<code>for item in my_list:
    result = expensive_function(item)
    print(result)
</code>I introduced caching (memoization):
<code>cache = {}
for item in my_list:
    if item not in cache:
        cache[item] = expensive_function(item)
    print(cache[item])
</code>Memoization saved a lot of time.
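The standard library ships the same idea ready-made as `functools.lru_cache`; a minimal sketch, where the body of `expensive_function` is just a stand-in for a costly computation:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_function(item):
    # Stand-in for a costly computation
    return item ** 2

my_list = [1, 2, 2, 3, 3, 3]
results = [expensive_function(item) for item in my_list]

# cache_info() reports how many calls were answered from the cache
print(expensive_function.cache_info())
```

The decorator handles the bookkeeping for you; note that it requires the function's arguments to be hashable, and a bounded `maxsize` keeps the cache from growing without limit.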
9. Optimize imports
Instead of importing an entire library just to use one function, I import only what I need. One caveat: writing `from pandas import read_csv` still loads the whole pandas package, so by itself it saves nothing. What actually improves start‑up time is avoiding heavy imports entirely, or deferring them into the function that uses them:
<code># Before: pandas is loaded at start-up even if no CSV is ever read
import pandas
result = pandas.read_csv("data.csv")

# After: defer the heavy import until it is actually needed
def load_data(path):
    from pandas import read_csv
    return read_csv(path)
</code>This shortens start‑up time for runs that never reach the CSV‑loading path.
10. Use the latest Python version
Upgrading to a newer Python version improved script speed even without code changes; Python 3.11, for example, shipped substantial interpreter optimizations.