Python Code Optimization Techniques for Faster Execution
This article explains techniques for speeding up Python code: avoiding global variables, minimizing attribute access, removing unnecessary abstraction and data copies, using built-ins such as join, optimizing loops, and compiling hot functions with numba.jit. Each technique comes with a code example and a measured execution time.
It starts with three basic optimization principles: avoid premature optimization, weigh the cost of optimization, and focus on the parts of the code that actually affect performance.
0. Code Optimization Principles
Before diving into specific tricks, understand that optimization should only be applied after the code works correctly and the performance bottlenecks are identified.
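To find the bottlenecks mentioned above, the standard library's cProfile module is one option. The sketch below is only an illustration: the functions slow_part, fast_part, workload, and profile_report are hypothetical names invented for this example, not part of the original article.

```python
import cProfile
import io
import math
import pstats


def slow_part(n):
    # Deliberately heavy: many sqrt calls inside a generator expression
    return sum(math.sqrt(i) for i in range(n))


def fast_part(n):
    # Deliberately cheap: constant-time arithmetic
    return n * 2


def workload():
    slow_part(50_000)
    fast_part(50_000)


def profile_report(func, limit=10):
    """Run func under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    print(profile_report(workload))
```

In the report, slow_part should account for most of the cumulative time, which marks it as the part worth optimizing first.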
1. Avoid Global Variables
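# Not recommended. Execution time: 26.8 s
import math
size = 10000
for x in range(size):
    for y in range(size):
        z = math.sqrt(x) + math.sqrt(y)
Code written at module level stores size, x, y, and z as global variables, which are looked up in a dictionary. Moving the same code into a function turns them into local variables, which are faster to access, and reduces the runtime by 15-30%.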
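One way to check this effect on your own machine is to time the same loop once as module-level code and once wrapped in a function. The sketch below is an assumption-laden illustration: it uses a much smaller loop (300 x 300) so it finishes quickly, and the names MODULE_LEVEL, IN_FUNCTION, and time_snippet are made up for this example. The measured ratio will vary with the Python version and hardware.

```python
import timeit

# The loop as top-level code: x, y and z live in the module namespace
MODULE_LEVEL = """
import math
for x in range(300):
    for y in range(300):
        z = math.sqrt(x) + math.sqrt(y)
"""

# The same loop inside a function: x, y and z are local variables
IN_FUNCTION = """
import math
def main():
    for x in range(300):
        for y in range(300):
            z = math.sqrt(x) + math.sqrt(y)
main()
"""


def time_snippet(source, repeat=3):
    """Return the best wall-clock time (in seconds) for running source once."""
    code = compile(source, "<snippet>", "exec")
    # A fresh dict plays the role of the module namespace on every run
    return min(timeit.repeat(lambda: exec(code, {}), number=1, repeat=repeat))


if __name__ == "__main__":
    print(f"module level: {time_snippet(MODULE_LEVEL):.4f} s")
    print(f"in function:  {time_snippet(IN_FUNCTION):.4f} s")
```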
# Recommended. Execution time: 20.6 s
import math
def main():
    size = 10000
    for x in range(size):
        for y in range(size):
            z = math.sqrt(x) + math.sqrt(y)
main()
2. Avoid Attribute Access
2.1 Avoid Module and Function Attribute Access
# Not recommended. Execution time: 14.5 s
import math
def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(math.sqrt(i))
    return result
def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)
main()
Every use of the dot operator triggers an attribute lookup (through __getattribute__, with __getattr__ as the fallback), so math.sqrt is looked up on each iteration. Importing the name with from math import sqrt removes that lookup from the loop.
# First optimization. Execution time: 10.9 s
from math import sqrt
def computeSqrt(size: int):
    result = []
    for i in range(size):
        result.append(sqrt(i))
    return result
def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)
main()
Caching sqrt in a local variable gives a further speed-up, because local names are resolved faster than global ones.
# Second optimization. Execution time: 9.9 s
import math
def computeSqrt(size: int):
    result = []
    sqrt = math.sqrt  # cache the function in a local variable
    for i in range(size):
        result.append(sqrt(i))
    return result
def main():
    size = 10000
    for _ in range(size):
        result = computeSqrt(size)
main()
2.2 Avoid Class Attribute Access
# Not recommended. Execution time: 10.4 s
import math
from typing import List
class DemoClass:
    def __init__(self, value: int):
        self._value = value
    def computeSqrt(self, size: int) -> List[float]:
        result = []
        append = result.append
        sqrt = math.sqrt
        for _ in range(size):
            append(sqrt(self._value))  # attribute lookup on every iteration
        return result
def main():
    size = 10000
    for _ in range(size):
        demo_instance = DemoClass(size)
        result = demo_instance.computeSqrt(size)
main()
Accessing self._value inside the loop repeats the attribute lookup on every iteration. Assigning frequently used attributes to local variables (e.g., value = self._value) removes that extra cost.
# Recommended. Execution time: 8.0 s
import math
from typing import List
class DemoClass:
    def __init__(self, value: int):
        self._value = value
    def computeSqrt(self, size: int) -> List[float]:
        result = []
        append = result.append
        sqrt = math.sqrt
        value = self._value  # look the attribute up once
        for _ in range(size):
            append(sqrt(value))
        return result
def main():
    size = 10000
    for _ in range(size):
        demo_instance = DemoClass(size)
        demo_instance.computeSqrt(size)
main()
3. Avoid Unnecessary Abstraction
# Not recommended. Execution time: 0.55 s
class DemoClass:
    def __init__(self, value: int):
        self.value = value
    @property
    def value(self) -> int:
        return self._value
    @value.setter
    def value(self, x: int):
        self._value = x
def main():
    size = 1000000
    for i in range(size):
        demo_instance = DemoClass(size)
        value = demo_instance.value
        demo_instance.value = i
main()
A property getter or setter adds a function call to every access. When the getter and setter do no extra work, removing the property decorators and using a plain attribute reduces the overhead.
# Recommended. Execution time: 0.33 s
class DemoClass:
    def __init__(self, value: int):
        self.value = value  # simple attribute, no getter/setter
def main():
    size = 1000000
    for i in range(size):
        demo_instance = DemoClass(size)
        value = demo_instance.value
        demo_instance.value = i
main()
4. Avoid Unnecessary Data Copying
4.1 Eliminate Meaningless Copies
# Not recommended. Execution time: 6.5 s
def main():
    size = 10000
    for _ in range(size):
        value = range(size)
        value_list = [x for x in value]  # needless intermediate copy
        square_list = [x * x for x in value_list]
main()
Creating value_list is wasteful; build square_list directly from the original range object.
# Recommended. Execution time: 4.8 s
def main():
    size = 10000
    for _ in range(size):
        value = range(size)
        square_list = [x * x for x in value]
main()
4.2 Swap Values Without a Temporary Variable
# Not recommended. Execution time: 0.07 s
def main():
    size = 1000000
    for _ in range(size):
        a = 3
        b = 5
        temp = a
        a = b
        b = temp
main()
# Recommended. Execution time: 0.06 s
def main():
    size = 1000000
    for _ in range(size):
        a = 3
        b = 5
        a, b = b, a  # tuple unpacking, no temp variable
main()
4.3 Use join for String Concatenation
# Not recommended. Execution time: 2.6 s
import string
from typing import List
def concatString(string_list: List[str]) -> str:
    result = ''
    for str_i in string_list:
        result += str_i  # may build a new string on each iteration
    return result
def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)
main()
# Recommended. Execution time: 0.3 s
import string
from typing import List
def concatString(string_list: List[str]) -> str:
    return ''.join(string_list)  # join is O(n) and avoids intermediate strings
def main():
    string_list = list(string.ascii_letters * 100)
    for _ in range(10000):
        result = concatString(string_list)
main()
5. Exploit Short‑Circuit Behaviour of if
# Not recommended. Execution time: 0.05 s
from typing import List
def concatString(string_list: List[str]) -> str:
    abbreviations = {'cf.', 'e.g.', 'ex.', 'etc.', 'flg.', 'i.e.', 'Mr.', 'vs.'}
    result = ''
    for str_i in string_list:
        if str_i in abbreviations:
            result += str_i
    return result
def main():
    for _ in range(10000):
        string_list = ['Mr.', 'Hat', 'is', 'Chasing', 'the', 'black', 'cat', '.']
        result = concatString(string_list)
main()
In an expression a and b, Python evaluates b only when a is true. Putting a cheap condition that is usually false first (here, whether the word ends with '.') lets Python skip the set lookup for most words. For a or b the reverse applies: put the condition that is usually true first.
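To make the evaluation order visible, the following sketch counts how often each condition is actually evaluated. The helper names count_checks, ends_with_dot, and in_set are hypothetical and exist only for this illustration; the word list and the abbreviation set are the ones used in this section.

```python
ABBREVIATIONS = {'cf.', 'e.g.', 'ex.', 'etc.', 'flg.', 'i.e.', 'Mr.', 'vs.'}


def count_checks(words):
    """Return the matching words and how often each condition was evaluated."""
    calls = {"ends_with_dot": 0, "in_set": 0}

    def ends_with_dot(word):
        calls["ends_with_dot"] += 1
        return word.endswith(".")

    def in_set(word):
        calls["in_set"] += 1
        return word in ABBREVIATIONS

    # in_set runs only for words that passed ends_with_dot
    matches = [w for w in words if ends_with_dot(w) and in_set(w)]
    return matches, calls


if __name__ == "__main__":
    words = ['Mr.', 'Hat', 'is', 'Chasing', 'the', 'black', 'cat', '.']
    matches, calls = count_checks(words)
    print(matches, calls)
```

For this word list the first condition runs for all eight words, while the second runs only for the two that end with a period ('Mr.' and '.').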
# Recommended. Execution time: 0.03 s
from typing import List
def concatString(string_list: List[str]) -> str:
    abbreviations = {'cf.', 'e.g.', 'ex.', 'etc.', 'flg.', 'i.e.', 'Mr.', 'vs.'}
    result = ''
    for str_i in string_list:
        # cheap '.' check first; assumes non-empty strings
        if str_i[-1] == '.' and str_i in abbreviations:
            result += str_i
    return result
def main():
    for _ in range(10000):
        string_list = ['Mr.', 'Hat', 'is', 'Chasing', 'the', 'black', 'cat', '.']
        result = concatString(string_list)
main()
6. Loop Optimizations
6.1 Replace while with for
# Not recommended. Execution time: 6.7 s
def computeSum(size: int) -> int:
    sum_ = 0
    i = 0
    while i < size:
        sum_ += i
        i += 1
    return sum_
def main():
    size = 10000
    for _ in range(size):
        sum_ = computeSum(size)
main()
# Recommended. Execution time: 4.3 s
def computeSum(size: int) -> int:
    sum_ = 0
    for i in range(size):
        sum_ += i
    return sum_
def main():
    size = 10000
    for _ in range(size):
        sum_ = computeSum(size)
main()
6.2 Use Implicit for Loops
# Recommended. Execution time: 1.7 s
def computeSum(size: int) -> int:
    return sum(range(size))  # built-in sum with implicit iteration
def main():
    size = 10000
    for _ in range(size):
        sum_ = computeSum(size)
main()
6.3 Reduce Work Inside Inner Loops
# Not recommended. Execution time: 12.8 s
import math
def main():
    size = 10000
    sqrt = math.sqrt
    for x in range(size):
        for y in range(size):
            z = sqrt(x) + sqrt(y)  # sqrt(x) is recomputed for every y
main()
# Recommended. Execution time: 7.0 s
import math
def main():
    size = 10000
    sqrt = math.sqrt
    for x in range(size):
        sqrt_x = sqrt(x)  # compute once per outer loop
        for y in range(size):
            z = sqrt_x + sqrt(y)
main()
7. Use numba.jit for JIT Compilation
The numba.jit decorator compiles a Python function to machine code the first time it is called, which can dramatically reduce execution time for numeric loops.
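Because compilation happens on the first call, that call is usually slower than later ones. The sketch below separates the two timings. It is an illustration under assumptions: the names compute_sum and timed_call are invented for this example, numba.njit is used as the decorator, and a do-nothing fallback decorator is defined so the sketch still runs (without any speed-up) when Numba is not installed.

```python
import time

try:
    import numba
    jit = numba.njit  # compile in nopython mode when Numba is installed
except ImportError:
    def jit(func):
        # Fallback for this sketch only: run the plain Python function
        return func


@jit
def compute_sum(size):
    total = 0
    for i in range(size):
        total += i
    return total


def timed_call(size):
    """Return the result of compute_sum(size) and the elapsed seconds."""
    start = time.perf_counter()
    result = compute_sum(size)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, first = timed_call(10_000)   # includes compilation when Numba is present
    _, second = timed_call(10_000)  # reuses the already compiled code
    print(f"first call: {first:.6f} s, second call: {second:.6f} s")
```

When benchmarking JIT-compiled code, it is common to run one warm-up call before measuring so that compile time is not counted.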
# Recommended. Execution time: 0.62 s
import numba
@numba.jit
def computeSum(size: int) -> int:
    sum_ = 0  # avoid shadowing the built-in sum
    for i in range(size):
        sum_ += i
    return sum_
def main():
    size = 10000
    for _ in range(size):
        sum_ = computeSum(size)
main()
8. Choose Appropriate Data Structures
Python's built-in containers (list, tuple, dict, set, etc.) are implemented in C and are highly efficient. For frequent insertions and deletions at either end of a sequence, collections.deque offers O(1) appends and pops at both ends. When fast look-ups in ordered data are needed, bisect keeps a list sorted and finds positions by binary search, and heapq gives O(1) access to the smallest element, with O(log n) push and pop.
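The three modules mentioned above can be tried with a few lines each. This is a minimal sketch with made-up sample data; the function names demo_deque, demo_bisect, and demo_heapq are hypothetical and serve only to group the calls.

```python
import bisect
import heapq
from collections import deque


def demo_deque():
    queue = deque([1, 2, 3])
    queue.appendleft(0)      # O(1) insert at the left end
    queue.append(4)          # O(1) insert at the right end
    first = queue.popleft()  # O(1) removal from the left end
    last = queue.pop()       # O(1) removal from the right end
    return first, last, list(queue)


def demo_bisect():
    scores = [10, 20, 30, 40]                # must already be sorted
    bisect.insort(scores, 25)                # insert while keeping the order
    index = bisect.bisect_left(scores, 30)   # binary search for a position
    return scores, index


def demo_heapq():
    heap = [5, 1, 4, 2]
    heapq.heapify(heap)           # rearrange the list into a min-heap in place
    smallest = heap[0]            # O(1) peek at the smallest element
    heapq.heappush(heap, 0)       # O(log n) insert
    popped = heapq.heappop(heap)  # O(log n) removal of the smallest element
    return smallest, popped, heap[0]


if __name__ == "__main__":
    print(demo_deque())
    print(demo_bisect())
    print(demo_heapq())
```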