Fundamentals 19 min read

Boost Python Speed: 10 Proven Tricks to Accelerate Your Code

This article presents practical Python performance‑boosting techniques—including avoiding global variables, minimizing attribute access, eliminating unnecessary abstractions, reducing data copies, optimizing loops, leveraging built‑in functions, and using tools like numba—each illustrated with before‑and‑after code snippets and measured speed improvements.

MaGe Linux Operations
MaGe Linux Operations
MaGe Linux Operations
Boost Python Speed: 10 Proven Tricks to Accelerate Your Code

Python is a scripting language that, compared with compiled languages like C/C++, has some efficiency and performance limitations, but its speed is often better than expected; this article collects techniques to accelerate Python code.

0. Code Optimization Principles

Before diving into detailed optimizations, understand basic principles: avoid premature optimization, weigh the cost of optimization, and focus on the parts of code that actually impact performance.

1. Avoid Global Variables

# Not recommended. Execution time: 26.8 seconds
import math

size = 10000
for x in range(size):
    for y in range(size):
        z = math.sqrt(x) + math.sqrt(y)

Placing code at the module level makes variable look‑ups slower. Wrapping the script in a function can give a 15%‑30% speed boost.

# Recommended. Execution time: 20.6 seconds
import math

def main():
    size = 10000
    for x in range(size):
        for y in range(size):
            z = math.sqrt(x) + math.sqrt(y)

main()

2. Avoid Attribute Access (.)

2.1 Avoid Module and Function Attribute Access

# Not recommended. Execution time: 14.5 seconds
import math

def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(math.sqrt(i))
    return result

def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)

main()

Each attribute lookup triggers dictionary operations, adding overhead. Importing the function directly removes this cost.

# First optimization. Execution time: 10.9 seconds
from math import sqrt

def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(sqrt(i))  # avoid math.sqrt
    return result

def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)

main()

Assigning frequently used functions to local variables speeds up look‑ups.

# Second optimization. Execution time: 9.9 seconds
import math

def computeSqrt(size: int):
    result = []
    sqrt = math.sqrt  # local variable
    for i in range(size):
        result.append(sqrt(i))
    return result

def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)

main()

Eliminate both the method call and the attribute lookup.

# Recommended. Execution time: 7.9 seconds
import math

def computeSqrt(size: int):
    result = []
    append = result.append
    sqrt = math.sqrt
    for i in range(size):
        append(sqrt(i))  # avoid result.append and math.sqrt
    return result

def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)

main()

2.2 Avoid Class Attribute Access

# Not recommended. Execution time: 10.4 seconds
import math
from typing import List

class DemoClass:
    def __init__(self, value: int):
        self._value = value

    def computeSqrt(self, size: int) -> List[float]:
        result = []
        append = result.append
        sqrt = math.sqrt
        for _ in range(size):
            append(sqrt(self._value))
        return result

def main():
    size = 10000
    for _ in range(size):
        demo_instance = DemoClass(size)
        result = demo_instance.computeSqrt(size)

main()

Accessing self._value repeatedly is slower than using a local variable.

# Recommended. Execution time: 8.0 seconds
import math
from typing import List

class DemoClass:
    def __init__(self, value: int):
        self._value = value

    def computeSqrt(self, size: int) -> List[float]:
        result = []
        append = result.append
        sqrt = math.sqrt
        value = self._value
        for _ in range(size):
            append(sqrt(value))  # avoid self._value
        return result

def main():
    size = 10000
    for _ in range(size):
        demo_instance = DemoClass(size)
        demo_instance.computeSqrt(size)

main()

3. Avoid Unnecessary Abstraction

# Not recommended. Execution time: 0.55 seconds
class DemoClass:
    def __init__(self, value: int):
        self.value = value

    @property
    def value(self) -> int:
        return self._value

    @value.setter
    def value(self, x: int):
        self._value = x

def main():
    size = 1000000
    for i in range(size):
        demo_instance = DemoClass(size)
        value = demo_instance.value
        demo_instance.value = i

main()

Extra layers such as property decorators add overhead; use plain attributes when no special behavior is needed.

# Recommended. Execution time: 0.33 seconds
class DemoClass:
    def __init__(self, value: int):
        self.value = value  # simple attribute, no property

def main():
    size = 1000000
    for i in range(size):
        demo_instance = DemoClass(size)
        value = demo_instance.value
        demo_instance.value = i

main()

4. Avoid Data Copying

4.1 Avoid Meaningless Data Copying

# Not recommended. Execution time: 6.5 seconds
def main():
    size = 10000
    for _ in range(size):
        value = range(size)
        value_list = [x for x in value]
        square_list = [x * x for x in value_list]

main()

Creating value_list is unnecessary; it duplicates data.

# Recommended. Execution time: 4.8 seconds
def main():
    size = 10000
    for _ in range(size):
        value = range(size)
        square_list = [x * x for x in value]  # avoid meaningless copy

main()

Avoid overusing copy.deepcopy when Python's memory model already shares objects efficiently.

4.2 Swap Values Without Temporary Variable

# Not recommended. Execution time: 0.07 seconds
def main():
    size = 1000000
    for _ in range(size):
        a = 3
        b = 5
        temp = a
        a = b
        b = temp

main()

Using a temporary variable adds extra assignments.

# Recommended. Execution time: 0.06 seconds
def main():
    size = 1000000
    for _ in range(size):
        a = 3
        b = 5
        a, b = b, a  # swap without temp

main()

4.3 Use join for String Concatenation

# Not recommended. Execution time: 2.6 seconds
import string
from typing import List

def concatString(string_list: List[str]) -> str:
    result = ''
    for str_i in string_list:
        result += str_i
    return result

def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)

main()

Repeated += creates many intermediate strings because strings are immutable.

# Recommended. Execution time: 0.3 seconds
import string
from typing import List

def concatString(string_list: List[str]) -> str:
    return ''.join(string_list)  # use join instead of +

def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)

main()

5. Use Short‑Circuiting in if Conditions

# Not recommended. Execution time: 0.05 seconds
from typing import List

def concatString(string_list: List[str]) -> str:
    abbreviations = {'cf.', 'e.g.', 'ex.', 'etc.', 'flg.', 'i.e.', 'Mr.', 'vs.'}
    result = ''
    for str_i in string_list:
        if str_i in abbreviations:
            result += str_i
    return result

def main():
    for _ in range(10000):
        string_list = ['Mr.', 'Hat', 'is', 'Chasing', 'the', 'black', 'cat', '.']
        result = concatString(string_list)

main()

Place the most likely true condition first to benefit from short‑circuit evaluation.

# Recommended. Execution time: 0.03 seconds
from typing import List

def concatString(string_list: List[str]) -> str:
    abbreviations = {'cf.', 'e.g.', 'ex.', 'etc.', 'flg.', 'i.e.', 'Mr.', 'vs.'}
    result = ''
    for str_i in string_list:
        if str_i[-1] == '.' and str_i in abbreviations:  # short‑circuit
            result += str_i
    return result

def main():
    for _ in range(10000):
        string_list = ['Mr.', 'Hat', 'is', 'Chasing', 'the', 'black', 'cat', '.']
        result = concatString(string_list)

main()

6. Loop Optimizations

6.1 Replace while with for

# Not recommended. Execution time: 6.7 seconds
def computeSum(size: int) -> int:
    sum_ = 0
    i = 0
    while i < size:
        sum_ += i
        i += 1
    return sum_

def main():
    size = 10000
    for _ in range(size):
        sum_ = computeSum(size)

main()

Python's for loop is faster than a manual while loop.

# Recommended. Execution time: 4.3 seconds
def computeSum(size: int) -> int:
    sum_ = 0
    for i in range(size):  # for replaces while
        sum_ += i
    return sum_

def main():
    size = 10000
    for _ in range(size):
        sum_ = computeSum(size)

main()

6.2 Use Implicit for Loops

Further simplify by using built‑in functions that hide the loop.

# Recommended. Execution time: 1.7 seconds
def computeSum(size: int) -> int:
    return sum(range(size))  # implicit for loop

def main():
    size = 10000
    for _ in range(size):
        sum = computeSum(size)

main()

6.3 Reduce Inner Loop Calculations

# Not recommended. Execution time: 12.8 seconds
import math

def main():
    size = 10000
    sqrt = math.sqrt
    for x in range(size):
        for y in range(size):
            z = sqrt(x) + sqrt(y)

main()

Calling sqrt inside the inner loop repeats the same computation.

# Recommended. Execution time: 7.0 seconds
import math

def main():
    size = 10000
    sqrt = math.sqrt
    for x in range(size):
        sqrt_x = sqrt(x)  # compute once per outer iteration
        for y in range(size):
            z = sqrt_x + sqrt(y)

main()

7. Use numba.jit

Applying @numba.jit compiles Python functions to machine code, dramatically speeding up execution.

# Recommended. Execution time: 0.62 seconds
import numba

@numba.jit
def computeSum(size: float) -> int:
    sum = 0
    for i in range(size):
        sum += i
    return sum

def main():
    size = 10000
    for _ in range(size):
        sum = computeSum(size)

main()

8. Choose Appropriate Data Structures

Python's built‑in structures (list, tuple, set, dict) are implemented in C and are highly optimized. For frequent insertions and deletions, collections.deque offers O(1) operations at both ends. When fast look‑ups or ordered access are needed, use bisect to keep a list sorted for binary search, or heapq to maintain a heap for quick min/max retrieval.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Sign in to view source
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactadmin@besthub.devand we will review it promptly.

performanceoptimizationData StructuresBenchmarkingcodenumba
MaGe Linux Operations
Written by

MaGe Linux Operations

Founded in 2009, MaGe Education is a top Chinese high‑end IT training brand. Its graduates earn 12K+ RMB salaries, and the school has trained tens of thousands of students. It offers high‑pay courses in Linux cloud operations, Python full‑stack, automation, data analysis, AI, and Go high‑concurrency architecture. Thanks to quality courses and a solid reputation, it has talent partnerships with numerous internet firms.

0 followers
Reader feedback

How this landed with the community

Sign in to like

Rate this article

Was this worth your time?

Sign in to rate
Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.