Fundamentals 20 min read

Python Code Optimization Techniques to Improve Execution Speed

This article presents a collection of Python performance optimization strategies, including avoiding global variables, minimizing attribute access, reducing unnecessary abstractions, eliminating data copying, leveraging short‑circuit logic, optimizing loops, using JIT compilation with numba, and selecting appropriate data structures to significantly speed up code execution.

Python Programming Learning Circle
Python Programming Learning Circle
Python Programming Learning Circle
Python Code Optimization Techniques to Improve Execution Speed

0. Code Optimization Principles

This article introduces many Python speed‑up techniques. Before diving into details, you need to understand some basic optimization principles.

First principle: Do not premature optimize

Many developers start optimizing code immediately, but the premise of optimization is that the code works correctly. Optimizing too early can distract from overall performance goals.

Second principle: Weigh the cost of optimization

Optimization has a cost; solving all performance problems is almost impossible. Usually you trade time for space or space for time, and you must also consider development effort.

Third principle: Do not optimize irrelevant parts

If you try to optimize every part of the code, it becomes hard to read and understand. Identify the slow parts (often inner loops) and focus optimization there.

1. Avoid Global Variables

# Not recommended. Execution time: 26.8 seconds
import math
size = 10000
for x in range(size):
    for y in range(size):
        z = math.sqrt(x) + math.sqrt(y)

Placing code at the global scope makes variable look‑ups slower than inside a function. Wrapping the script in a function can give a 15‑30% speed boost.

# Recommended. Execution time: 20.6 seconds
import math

def main():
    size = 10000
    for x in range(size):
        for y in range(size):
            z = math.sqrt(x) + math.sqrt(y)

main()

2. Avoid . (Attribute Access)

2.1 Avoid module and function attribute access

# Not recommended. Execution time: 14.5 seconds
import math

def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(math.sqrt(i))
    return result

def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)
    
main()

Each attribute access triggers __getattribute__ or __getattr__, which involve dictionary look‑ups and add overhead. Import the needed names directly to eliminate attribute access.

# First optimization. Execution time: 10.9 seconds
from math import sqrt

def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(sqrt(i))
    return result

def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)
    
main()

Local variables are faster than globals; assigning frequently used functions (e.g., sqrt ) to a local variable speeds up the inner loop.

# Second optimization. Execution time: 9.9 seconds
import math

def computeSqrt(size: int):
    result = []
    sqrt = math.sqrt  # local alias
    for i in range(size):
        result.append(sqrt(i))
    return result

def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)
    
main()

Similarly, assigning list.append to a local variable removes repeated attribute look‑ups.

# Third optimization. Execution time: 7.9 seconds
import math

def computeSqrt(size: int):
    result = []
    append = result.append
    sqrt = math.sqrt
    for i in range(size):
        append(sqrt(i))
    return result

def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)
    
main()

2.2 Avoid class attribute access

# Not recommended. Execution time: 10.4 seconds
import math
from typing import List

class DemoClass:
    def __init__(self, value: int):
        self._value = value
    def computeSqrt(self, size: int) -> List[float]:
        result = []
        for _ in range(size):
            result.append(math.sqrt(self._value))
        return result

def main():
    size = 10000
    for _ in range(size):
        demo = DemoClass(size)
        demo.computeSqrt(size)
    
main()

Accessing self._value inside a tight loop is slower than using a local variable.

# Recommended. Execution time: 8.0 seconds
import math
from typing import List

class DemoClass:
    def __init__(self, value: int):
        self._value = value
    def computeSqrt(self, size: int) -> List[float]:
        result = []
        sqrt = math.sqrt
        val = self._value
        for _ in range(size):
            result.append(sqrt(val))
        return result

def main():
    size = 10000
    for _ in range(size):
        demo = DemoClass(size)
        demo.computeSqrt(size)
    
main()

3. Avoid Unnecessary Abstractions

# Not recommended. Execution time: 0.55 seconds
class DemoClass:
    def __init__(self, value: int):
        self.value = value
    @property
    def value(self) -> int:
        return self._value
    @value.setter
    def value(self, x: int):
        self._value = x

def main():
    size = 1000000
    for i in range(size):
        demo = DemoClass(size)
        v = demo.value
        demo.value = i
    
main()

Extra layers such as decorators, property getters/setters, or descriptors add overhead. Use plain attributes when possible.

# Recommended. Execution time: 0.33 seconds
class DemoClass:
    def __init__(self, value: int):
        self.value = value

def main():
    size = 1000000
    for i in range(size):
        demo = DemoClass(size)
        v = demo.value
        demo.value = i
    
main()

4. Avoid Unnecessary Data Copying

4.1 Avoid meaningless data copies

# Not recommended. Execution time: 6.5 seconds
def main():
    size = 10000
    for _ in range(size):
        value = range(size)
        value_list = [x for x in value]
        square_list = [x * x for x in value_list]
    
main()

Creating value_list is unnecessary and wastes memory.

# Recommended. Execution time: 4.8 seconds
def main():
    size = 10000
    for _ in range(size):
        value = range(size)
        square_list = [x * x for x in value]
    
main()

4.2 Swap values without a temporary variable

# Not recommended. Execution time: 0.07 seconds
def main():
    size = 1000000
    for _ in range(size):
        a = 3
        b = 5
        temp = a
        a = b
        b = temp
    
main()

Using a temporary variable adds overhead.

# Recommended. Execution time: 0.06 seconds
def main():
    size = 1000000
    for _ in range(size):
        a = 3
        b = 5
        a, b = b, a  # tuple unpacking
    
main()

4.3 Use join instead of + for string concatenation

# Not recommended. Execution time: 2.6 seconds
import string
from typing import List

def concatString(string_list: List[str]) -> str:
    result = ''
    for s in string_list:
        result += s
    return result

def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)
    
main()

Using + creates a new immutable string each iteration, leading to O(n) intermediate objects.

# Recommended. Execution time: 0.3 seconds
import string
from typing import List

def concatString(string_list: List[str]) -> str:
    return ''.join(string_list)

def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)
    
main()

5. Exploit Short‑Circuit Behaviour of if

# Not recommended. Execution time: 0.05 seconds
from typing import List

def concatString(string_list: List[str]) -> str:
    abbreviations = {'cf.', 'e.g.', 'ex.', 'etc.', 'flg.', 'i.e.', 'Mr.', 'vs.'}
    result = ''
    for s in string_list:
        if s in abbreviations:
            result += s
    return result

def main():
    for _ in range(10000):
        lst = ['Mr.', 'Hat', 'is', 'Chasing', 'the', 'black', 'cat', '.']
        concatString(lst)
    
main()

Place the condition that is more likely to be true before the or part, and the condition that is likely false after the and part, so the interpreter can skip unnecessary checks.

# Recommended. Execution time: 0.03 seconds
from typing import List

def concatString(string_list: List[str]) -> str:
    abbreviations = {'cf.', 'e.g.', 'ex.', 'etc.', 'flg.', 'i.e.', 'Mr.', 'vs.'}
    result = ''
    for s in string_list:
        if s[-1] == '.' and s in abbreviations:  # short‑circuit
            result += s
    return result

def main():
    for _ in range(10000):
        lst = ['Mr.', 'Hat', 'is', 'Chasing', 'the', 'black', 'cat', '.']
        concatString(lst)
    
main()

6. Loop Optimizations

6.1 Use for instead of while

# Not recommended. Execution time: 6.7 seconds
def computeSum(size: int) -> int:
    sum_ = 0
    i = 0
    while i < size:
        sum_ += i
        i += 1
    return sum_

def main():
    size = 10000
    for _ in range(size):
        computeSum(size)
    
main()

Python's for loop is faster than a manual while loop.

# Recommended. Execution time: 4.3 seconds
def computeSum(size: int) -> int:
    sum_ = 0
    for i in range(size):
        sum_ += i
    return sum_

def main():
    size = 10000
    for _ in range(size):
        computeSum(size)
    
main()

6.2 Use implicit for (built‑ins) instead of explicit loops

# Recommended. Execution time: 1.7 seconds
def computeSum(size: int) -> int:
    return sum(range(size))

def main():
    size = 10000
    for _ in range(size):
        computeSum(size)
    
main()

6.3 Reduce work inside inner loops

# Not recommended. Execution time: 12.8 seconds
import math

def main():
    size = 10000
    for x in range(size):
        for y in range(size):
            z = math.sqrt(x) + math.sqrt(y)
    
main()

Calling math.sqrt inside the inner loop repeats work.

# Recommended. Execution time: 7.0 seconds
import math

def main():
    size = 10000
    sqrt = math.sqrt
    for x in range(size):
        sqrt_x = sqrt(x)  # compute once per outer iteration
        for y in range(size):
            z = sqrt_x + sqrt(y)
    
main()

7. Use numba.jit for JIT compilation

# Recommended. Execution time: 0.62 seconds
import numba

@numba.jit
def computeSum(size: int) -> int:
    sum_ = 0
    for i in range(size):
        sum_ += i
    return sum_

def main():
    size = 10000
    for _ in range(size):
        computeSum(size)
    
main()

8. Choose Appropriate Data Structures

Built‑in structures such as list , tuple , dict , set , and str are implemented in C and are very fast. For frequent insert/delete operations, consider collections.deque , which provides O(1) operations at both ends. For fast look‑ups in a sorted collection, use the bisect module. For frequent min/max queries, the heapq module turns a list into a heap with O(1) access to the smallest element.

References

https://zhuanlan.zhihu.com/p/143052860

David Beazley & Brian K. Jones, *Python Cookbook*, 3rd edition, O'Reilly, 2013.

张颖 & 赖勇浩, *编写高质量代码:改善Python程序的91个建议*, 机械工业出版社, 2014.

performanceOptimizationbest practicescode
Python Programming Learning Circle
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.

0 followers
Reader feedback

How this landed with the community

login Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.