Boost Python Speed: 10 Proven Code Optimization Tricks

This article presents practical Python performance tips, covering fundamental optimization principles, avoiding global variables and attribute lookups, reducing unnecessary abstractions, eliminating data copies, leveraging short‑circuit logic, loop improvements, JIT compilation with numba, and choosing efficient built‑in data structures, all illustrated with measurable code examples.

0. Code Optimization Principles

Before diving into specific techniques, understand three basic principles: avoid premature optimization, weigh the cost of optimizations, and focus on the parts of code that actually affect performance.
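
Finding the code that actually affects performance starts with measurement. The following is a minimal sketch using the standard library's timeit module; the two candidate functions and the best_time helper are hypothetical names introduced here for illustration, not part of the original article.

```python
import timeit


def build_with_loop(size: int) -> list:
    """Hypothetical candidate A: explicit loop with append."""
    result = []
    for i in range(size):
        result.append(i * i)
    return result


def build_with_comprehension(size: int) -> list:
    """Hypothetical candidate B: list comprehension."""
    return [i * i for i in range(size)]


def best_time(func, size: int = 10_000, repeat: int = 5, number: int = 20) -> float:
    """Return the best per-call time in seconds over several repeats."""
    timer = timeit.Timer(lambda: func(size))
    # Taking the minimum reduces noise from other processes on the machine.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    for func in (build_with_loop, build_with_comprehension):
        print(f"{func.__name__}: {best_time(func) * 1e6:.1f} microseconds per call")
```

The absolute numbers depend on the machine and Python version, so compare candidates on the same system rather than against the timings quoted in this article.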

1. Avoid Global Variables

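Code written at module (global) scope runs slower than the same code inside a function, because local variables are resolved faster than global ones. Moving the code into a function can improve speed by 15‑30%.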

# Not recommended (26.8 s)
import math
size = 10000
for x in range(size):
    for y in range(size):
        z = math.sqrt(x) + math.sqrt(y)
# Recommended (20.6 s)
import math

def main():
    size = 10000
    for x in range(size):
        for y in range(size):
            z = math.sqrt(x) + math.sqrt(y)

main()
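
One way to see why locals are cheaper is to inspect the bytecode. The sketch below assumes CPython, where globals are looked up by name in a dictionary and locals are read from indexed slots; the exact opcode names vary between versions, and use_global, use_local, and opnames are hypothetical helpers.

```python
import dis

SIZE = 10  # module-level (global) name


def use_global() -> int:
    return SIZE + 1  # SIZE is resolved through the module's globals


def use_local() -> int:
    size = 10  # local name, stored in a fast indexed slot
    return size + 1


def opnames(func) -> list:
    """Return the names of the bytecode instructions of func."""
    return [ins.opname for ins in dis.get_instructions(func)]


if __name__ == "__main__":
    print("global:", opnames(use_global))
    print("local: ", opnames(use_local))
```

On CPython the first listing contains a LOAD_GLOBAL instruction, while the second reads the variable with an instruction from the LOAD_FAST family.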

2. Avoid Attribute Access

2.1 Module and Function Attribute Access

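Every use of the dot operator (for example math.sqrt) triggers a dictionary lookup at run time. Importing the function directly with from ... import removes the module attribute lookup, and binding it to a local variable additionally turns the global lookup into a faster local one.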

# First optimization (10.9 s)
from math import sqrt

def compute_sqrt(size: int):
    result = []
    for i in range(size):
        result.append(sqrt(i))
    return result
# Second optimization (9.9 s)
import math

def compute_sqrt(size: int):
    result = []
    sqrt = math.sqrt  # local variable
    for i in range(size):
        result.append(sqrt(i))
    return result
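
The same idea can be taken one step further, since result.append is itself an attribute lookup repeated on every iteration. The following sketch shows two variants under that assumption; the function names are hypothetical and no timings are claimed for them.

```python
import math


def compute_sqrt_cached(size: int) -> list:
    """Bind both math.sqrt and result.append to local names."""
    result = []
    sqrt = math.sqrt
    append = result.append  # bound method looked up once, not per iteration
    for i in range(size):
        append(sqrt(i))
    return result


def compute_sqrt_comprehension(size: int) -> list:
    """A list comprehension avoids the explicit append call entirely."""
    sqrt = math.sqrt
    return [sqrt(i) for i in range(size)]


if __name__ == "__main__":
    print(compute_sqrt_cached(5))
    print(compute_sqrt_comprehension(5))
```

Caching append makes the code a little harder to read, so it is worth doing only in loops that profiling has shown to be hot.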

2.2 Class Attribute Access

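Reading self._value inside the loop costs an attribute lookup on every iteration; copying it to a local variable before the loop avoids that.

# Not recommended (10.4 s)
import math
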
class DemoClass:
    def __init__(self, value: int):
        self._value = value
    def compute_sqrt(self, size: int):
        result = []
        for _ in range(size):
            result.append(math.sqrt(self._value))
        return result
# Recommended (8.0 s)
import math

class DemoClass:
    def __init__(self, value: int):
        self._value = value
    def compute_sqrt(self, size: int):
        result = []
        sqrt = math.sqrt
        value = self._value
        for _ in range(size):
            result.append(sqrt(value))
        return result

3. Avoid Unnecessary Abstraction

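Extra layers such as property getters and setters add a function call to every access. Use them only when they are actually needed.

# Not recommended (0.55 s)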
class DemoClass:
    def __init__(self, value: int):
        self.value = value
    @property
    def value(self) -> int:
        return self._value
    @value.setter
    def value(self, x: int):
        self._value = x
# Recommended (0.33 s)
class DemoClass:
    def __init__(self, value: int):
        self.value = value

4. Avoid Unnecessary Data Copying

4.1 Eliminate Redundant Copies

# Not recommended (6.5 s)
value = range(size)
value_list = [x for x in value]
square_list = [x * x for x in value_list]
# Recommended (4.8 s)
value = range(size)
square_list = [x * x for x in value]
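
When the intermediate list is only consumed once, it can often be skipped altogether. The following is a minimal sketch using a generator expression; the function names are hypothetical, and the gain shown here is in memory (no list is materialized), not necessarily in running time.

```python
def sum_of_squares_list(size: int) -> int:
    """Builds the full list of squares before summing it."""
    squares = [x * x for x in range(size)]
    return sum(squares)


def sum_of_squares_lazy(size: int) -> int:
    """Feeds squares to sum() one at a time; no intermediate list."""
    return sum(x * x for x in range(size))


if __name__ == "__main__":
    print(sum_of_squares_list(10), sum_of_squares_lazy(10))
```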

4.2 Swap Without Temporary Variable

# Not recommended (0.07 s)
a = 3
b = 5
temp = a
a = b
b = temp
# Recommended (0.06 s)
a, b = b, a

4.3 Use join for String Concatenation

# Not recommended (2.6 s)
result = ''
for s in string_list:
    result += s
# Recommended (0.3 s)
result = ''.join(string_list)
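
The reason join helps is that strings are immutable, so each += may build a new string, while join assembles the result in one pass. The sketch below checks that both approaches give the same result and shows joining non-string items; the sample data and function names are assumptions for illustration.

```python
def concat_plus(parts: list) -> str:
    """Concatenate with += (may create a new string on each step)."""
    result = ''
    for s in parts:
        result += s
    return result


def concat_join(parts: list) -> str:
    """Concatenate with str.join (single pass over the parts)."""
    return ''.join(parts)


words = ['fast', 'er', ' ', 'code']  # sample data
numbers = [1, 2, 3]                  # sample data

# join accepts only strings, so convert other types first.
csv_line = ','.join(map(str, numbers))

if __name__ == "__main__":
    print(concat_join(words))
    print(csv_line)
```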

5. Leverage Short‑Circuit Evaluation

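Python evaluates and/or from left to right and stops as soon as the result is known, so place the cheap test that most often decides the outcome first.

# Recommended (0.03 s)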
for s in string_list:
    if s[-1] == '.' and s in abbreviations:
        result += s
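
To make the effect visible, the sketch below records how often the right-hand operand of and is actually evaluated. The abbreviation set, the word list, and the in_abbreviations helper are assumptions for illustration. It uses str.endswith rather than s[-1] so that empty strings do not raise IndexError.

```python
abbreviations = {'cf.', 'e.g.', 'etc.', 'i.e.', 'Mr.', 'vs.'}  # assumed sample set
checked = []  # records every word that reached the set lookup


def in_abbreviations(word: str) -> bool:
    """Set lookup wrapped in a function so that calls can be recorded."""
    checked.append(word)
    return word in abbreviations


words = ['Mr.', 'Hat', 'is', 'chasing', 'the', 'black', 'cat', '.']
result = ''
for w in words:
    # The lookup runs only for words that end with a period.
    if w.endswith('.') and in_abbreviations(w):
        result += w

if __name__ == "__main__":
    print(result, checked)
```

Only two of the eight words reach the lookup; the other six are rejected by the cheaper first test.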

6. Loop Optimizations

6.1 Replace while with for

# Not recommended (6.7 s)
i = 0
while i < size:
    sum_ += i
    i += 1
# Recommended (4.3 s)
for i in range(size):
    sum_ += i

6.2 Implicit for Loop

# Recommended (1.7 s)
sum_ = sum(range(size))

6.3 Reduce Inner‑Loop Computation

# Not recommended (12.8 s)
for x in range(size):
    for y in range(size):
        z = sqrt(x) + sqrt(y)
# Recommended (7.0 s)
for x in range(size):
    sqrt_x = sqrt(x)
    for y in range(size):
        z = sqrt_x + sqrt(y)
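
The same reasoning applies to sqrt(y), which is also recomputed for every value of x. The following sketch precomputes each root once; it accumulates a total so that the two versions can be compared, and the function names are hypothetical.

```python
from math import sqrt


def pairwise_naive(size: int) -> float:
    """Calls sqrt twice per inner iteration."""
    total = 0.0
    for x in range(size):
        for y in range(size):
            total += sqrt(x) + sqrt(y)
    return total


def pairwise_precomputed(size: int) -> float:
    """Computes each square root once, then only adds inside the loops."""
    roots = [sqrt(i) for i in range(size)]
    total = 0.0
    for sqrt_x in roots:
        for sqrt_y in roots:
            total += sqrt_x + sqrt_y
    return total


if __name__ == "__main__":
    print(pairwise_naive(50), pairwise_precomputed(50))
```

This trades memory for time: the list of roots needs one float per value of size.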

7. Use numba.jit

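numba compiles a decorated function to machine code just in time, which helps most with numeric loops like the one below.

import numba

@numba.jit
def compute_sum(size: int) -> int: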
    sum_ = 0
    for i in range(size):
        sum_ += i
    return sum_

8. Choose Appropriate Data Structures

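Built‑in containers (list, dict, set, tuple) are implemented in C and are fast. For frequent insertion or removal at either end of a sequence, use collections.deque, which does both in O(1) time, whereas inserting at the front of a list is O(n). For lookups in a sorted list, use bisect, which performs a binary search. For repeatedly retrieving the smallest or largest items, use heapq.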
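
A minimal sketch of the three standard library modules mentioned above; the sample values are arbitrary.

```python
import bisect
import heapq
from collections import deque

# deque: O(1) appends and pops at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
first = queue.popleft()  # removes the 0 that was just added on the left

# bisect: binary search and ordered insertion in a sorted list.
scores = [10, 20, 30, 40]
pos = bisect.bisect_left(scores, 30)  # index of the first item >= 30
bisect.insort(scores, 25)             # inserts while keeping the list sorted

# heapq: the smallest item is always at index 0 of the heap.
heap = [5, 1, 4, 2]
heapq.heapify(heap)
smallest = heapq.heappop(heap)
top_two = heapq.nlargest(2, [5, 1, 4, 2])

if __name__ == "__main__":
    print(first, list(queue))
    print(pos, scores)
    print(smallest, top_two)
```

Note that bisect finds the position in O(log n), but insort still shifts list elements, so the insertion itself remains O(n).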
