
How to Speed Up Python Programs: Profiling, Timing, and Practical Optimization Techniques

This article explains why Python itself is not slow, demonstrates how to identify bottlenecks with timing and cProfile, and provides a collection of practical tips—such as using built‑in types, lru_cache, local variables, and efficient string formatting—to improve Python program performance by up to 30 percent.

Python Programming Learning Circle

Python is popular worldwide, but many still claim it runs slowly; the real cause is often inefficient code rather than the language itself.

Before optimizing, you must locate the slow parts of your program. Simple tools like the Unix time command give you an overall runtime, while cProfile provides detailed per‑function statistics; sorting by internal time (-s time, the tottime column) surfaces the functions that spend the most time in their own code.

<code>~ $ time python3.8 slow_program.py
real  0m11,058s
user  0m11,050s
sys   0m0,008s</code>
<code>~ $ python3.8 -m cProfile -s time slow_program.py
         1297 function calls (1272 primitive calls) in 11.081 seconds
Ordered by: internal time
ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
    3    11.079   3.693   11.079   3.693  slow_program.py:4(exp)
    ... (other entries omitted)</code>
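The article does not reproduce slow_program.py itself. A plausible reconstruction, assuming the script follows the classic pattern of computing e^x from its Taylor series with decimal.Decimal (the function name exp matches the profile above, but the exact body is a guess), might look like:

```python
# Hypothetical reconstruction of slow_program.py: exp() evaluates e**x
# by summing the Taylor series x**n / n! with arbitrary-precision Decimal.
from decimal import Decimal, getcontext

def exp(x):
    getcontext().prec += 2          # extra digits so intermediate terms round cleanly
    i, lasts, s, fact, num = 0, 0, 1, 1, 1
    while s != lasts:               # stop once adding a term no longer changes the sum
        lasts = s
        i += 1
        fact *= i
        num *= x
        s += num / fact
    getcontext().prec -= 2
    return +s                       # unary plus re-rounds to the working precision

print(exp(Decimal(5)))              # about 148.413, i.e. e**5
```

Each call is pure Python arithmetic in a tight loop, which is exactly the kind of hotspot cProfile's tottime column points at.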

From the profiling output, the exp function is the main culprit. A simple decorator can time individual functions:

<code>import time
from functools import wraps

def timeit_wrapper(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print('{0:<10}.{1:<8} : {2:<8}'.format(func.__module__, func.__name__, end - start))
        return result
    return wrapper</code>

Applying the decorator to exp yields per‑call timings:

<code>~ $ python3.8 slow_program.py
module function   time
__main__ .exp      : 0.0032675
__main__ .exp      : 0.0385353
__main__ .exp      : 11.7284861</code>

When measuring time, choose the appropriate function: time.perf_counter measures elapsed wall‑clock time (so it includes waiting and is affected by system load), while time.process_time counts only the CPU time your process actually uses.
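The difference is easy to demonstrate with a snippet that mostly sleeps (a minimal sketch; the one‑second sleep and the busy loop are arbitrary choices):

```python
import time

start_wall = time.perf_counter()
start_cpu = time.process_time()

time.sleep(1)            # wall-clock time passes, but almost no CPU work happens
sum(range(10**6))        # CPU-bound work, counted by both clocks

wall = time.perf_counter() - start_wall
cpu = time.process_time() - start_cpu
print(f'wall: {wall:.3f}s  cpu: {cpu:.3f}s')  # cpu is roughly one second smaller
```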

Additional optimization tips include:

Prefer built‑in data types (implemented in C) over custom structures.
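As an illustration (a minimal benchmark sketch, not from the original article), summing a list with a hand‑written Python loop versus the C‑implemented built‑in sum:

```python
import timeit

data = list(range(100_000))

def manual_sum(seq):
    total = 0
    for x in seq:        # one interpreted bytecode step per element
        total += x
    return total

# Built-in sum runs the same loop in C, so it is several times faster.
t_manual = timeit.timeit(lambda: manual_sum(data), number=20)
t_builtin = timeit.timeit(lambda: sum(data), number=20)
print(f'python loop: {t_manual:.3f}s  built-in sum: {t_builtin:.3f}s')
```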

Use functools.lru_cache to memoize expensive function results.

<code>import functools, time
@functools.lru_cache(maxsize=12)
def slow_func(x):
    time.sleep(2)  # simulate long computation
    return x

slow_func(1)  # first call: 2 s delay
slow_func(1)  # cached: instant</code>

Store frequently accessed values in local variables to avoid repeated global lookups:

<code>class FastClass:
    def __init__(self):
        self.value = 42

    def do_stuff(self):
        temp = self.value  # one attribute lookup instead of one per iteration
        total = 0
        for i in range(10000):
            total += temp
        return total</code>

Wrap script code in a main() function and call it once to reduce global‑variable overhead.

<code>def main():
    # all previous global code goes here
    ...

main()</code>

Module‑level attribute access (e.g. re.findall) goes through __getattribute__ on every call; importing the needed function directly avoids that repeated lookup.

<code># Slow: attribute lookup on the module in every iteration
import re
regex, line = r'\w+', 'sample text to search'
for i in range(10000):
    re.findall(regex, line)

# Fast: the function is bound directly, no module lookup per call
from re import findall
for i in range(10000):
    findall(regex, line)</code>

Be careful when building strings in loops: f‑strings are the fastest, followed by + concatenation and ''.join; older methods such as % formatting, str.format, and string.Template are slower.

<code>from string import Template  # needed for the last variant

f'{s} {t}'                               # fastest
s + ' ' + t
' '.join((s, t))
'%s %s' % (s, t)
'{} {}'.format(s, t)
Template('$s $t').substitute(s=s, t=t)   # slowest</code>
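You can check this ordering on your own interpreter with timeit (a quick benchmark sketch; absolute numbers and even the relative order can vary by Python version and machine):

```python
import timeit

setup = "s, t = 'hello', 'world'"
candidates = {
    'f-string': "f'{s} {t}'",
    'concat':   "s + ' ' + t",
    'join':     "' '.join((s, t))",
    'percent':  "'%s %s' % (s, t)",
    'format':   "'{} {}'.format(s, t)",
}

results = {}
for name, stmt in candidates.items():
    results[name] = timeit.timeit(stmt, setup=setup, number=500_000)
    print(f'{name:<9} {results[name]:.3f}s')
```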

Generators save memory but not time; however, reduced memory pressure can improve cache performance for large data sets.
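A quick way to see the memory difference (a sketch using sys.getsizeof, which reports only the container's own footprint, not the elements it references):

```python
import sys

# A list materializes every element up front; a generator stores only its state.
squares_list = [n * n for n in range(100_000)]
squares_gen = (n * n for n in range(100_000))

print(sys.getsizeof(squares_list))   # hundreds of kilobytes
print(sys.getsizeof(squares_gen))    # on the order of a couple hundred bytes

same = sum(squares_gen) == sum(squares_list)   # both yield the same values
print(same)                                    # True
```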

Overall, the primary rule is "don't optimize prematurely"; when you do need to optimize, apply the above techniques judiciously, keeping code readable and maintainable.

Tags: Performance, Optimization, Python, profiling, lru-cache, Timing, cProfile
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
