6 Hidden Python Features That Can Double Your Coding Efficiency
This article showcases six powerful yet often overlooked Python standard‑library features—pathlib, contextlib, __slots__, functools.lru_cache, generator pipelines, and dataclasses—demonstrating how they simplify code, boost performance, reduce memory usage, and make scripts more maintainable.
1. pathlib: Say goodbye to os.path's verbosity
Traditional file handling often uses os and shutil. With pathlib, paths become objects that support chainable operations, making code more concise and intuitive.
import os
import shutil

# Find and move all PDF files
for root, dirs, files in os.walk("downloads"):
    for file in files:
        if file.endswith(".pdf"):
            src = os.path.join(root, file)
            dst = os.path.join("organized", file)
            shutil.move(src, dst)

Using pathlib:
from pathlib import Path

# One‑line solution
for pdf_file in Path("downloads").rglob("*.pdf"):
    pdf_file.rename(Path("organized") / pdf_file.name)

Typical use cases include batch renaming, existence checks, and creating nested directories:
Path("file.txt").rename("new_name.txt") Path("data.csv").exists() Path("a/b/c").mkdir(parents=True, exist_ok=True)2. contextlib: Elegant resource management
Instead of verbose try...finally blocks, contextlib.contextmanager lets you define reusable context managers.
# Traditional resource handling
db_connection = connect_to_database()
try:
    data = db_connection.query("SELECT * FROM users")
    process_data(data)
finally:
    db_connection.close()  # Easy to forget!

Custom context manager with contextlib:
from contextlib import contextmanager

@contextmanager
def managed_database(connection_string):
    """Automatically manage the lifecycle of a database connection"""
    conn = connect_to_database(connection_string)
    try:
        yield conn  # Hand over control
    finally:
        conn.close()  # Ensure closure

# Usage
with managed_database("postgresql://localhost/mydb") as db:
    results = db.query("SELECT * FROM users")
# No need to manually close the connection

Useful scenarios: file operations, database connections, network requests, and temporary files.
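As one concrete instance of the temporary‑file scenario, here is a minimal sketch; the helper name temporary_file and its cleanup policy are illustrative assumptions, not part of the original:

import os
from contextlib import contextmanager

@contextmanager
def temporary_file(path):
    """Create a scratch file for the block, then delete it (illustrative helper)."""
    f = open(path, "w")
    try:
        yield f
    finally:
        f.close()
        os.remove(path)  # Clean up even if the block raised

with temporary_file("scratch.txt") as f:
    f.write("intermediate results")
# scratch.txt has been removed at this point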
3. __slots__: Memory‑saving secret weapon
Creating millions of small objects can exhaust memory because each instance stores a __dict__. Declaring __slots__ fixes the attribute layout and reduces overhead.
class DataPoint:
    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.value = value

points = [DataPoint(i, i*2, i**2) for i in range(1_000_000)]  # ~200 MB

With __slots__:
class DataPoint:
    __slots__ = ('x', 'y', 'value')

    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.value = value

points = [DataPoint(i, i*2, i**2) for i in range(1_000_000)]  # ~120 MB, 40% saved

Best suited for massive creation of small objects, performance‑critical apps, and memory‑constrained environments. Note: instances of a __slots__ class cannot be given attributes that are not listed in __slots__.
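For example (the attribute name z is arbitrary):

p = DataPoint(1, 2, 3)
p.z = 10  # AttributeError: 'DataPoint' object has no attribute 'z'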
4. functools.lru_cache: Smart caching to avoid repeated work
Repeatedly calling a function with the same arguments leads to unnecessary work. functools.lru_cache memoizes results.
def get_user_data(user_id):
    # Each call queries the database
    return query_database(f"SELECT * FROM users WHERE id = {user_id}")

for _ in range(100):
    data = get_user_data(123)  # 100 database queries!

Optimized with lru_cache:
from functools import lru_cache

@lru_cache(maxsize=128)  # Cache the most recent 128 distinct calls
def get_user_data(user_id):
    print(f"Querying database: user_{user_id}")
    return query_database(f"SELECT * FROM users WHERE id = {user_id}")

data1 = get_user_data(123)  # First call prints the query
data2 = get_user_data(123)  # Cached, no print
data3 = get_user_data(123)  # Cached again

Applicable to API calls, heavy computations, configuration parsing, and database query caching.
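Decorated functions also expose cache statistics and a reset hook, useful when tuning maxsize. Continuing the example above:

print(get_user_data.cache_info())  # CacheInfo(hits=2, misses=1, maxsize=128, currsize=1)
get_user_data.cache_clear()        # Empty the cache, e.g. after the underlying data changes

One caveat: arguments must be hashable, so lists or dicts cannot be passed to an lru_cache‑decorated function.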
5. Generator pipelines: Memory‑friendly large‑data processing
Reading an entire multi‑gigabyte log file into memory can crash the program. Using generators enables lazy, line‑by‑line processing.
# Naïve approach – loads all lines into memory
def process_log_file(filename):
    with open(filename) as f:
        lines = f.readlines()
    results = []
    for line in lines:
        if "ERROR" in line:
            results.append(line.strip())
    return results

Generator‑based pipeline:
def read_lines(filename):
    """Yield lines one by one"""
    with open(filename) as f:
        for line in f:
            yield line

def filter_errors(lines):
    """Yield only error lines"""
    for line in lines:
        if "ERROR" in line:
            yield line

def clean_logs(lines):
    """Strip whitespace from each line"""
    for line in lines:
        yield line.strip()

log_file = "app.log"
for error in clean_logs(filter_errors(read_lines(log_file))):
    process_error(error)  # Memory usage stays constant

Advantages: high memory efficiency, strong composability, and immediate processing without waiting for the whole dataset.
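For short pipelines, the same laziness can be written even more compactly with generator expressions; this sketch assumes the same app.log file and process_error function as above:

with open("app.log") as f:
    errors = (line for line in f if "ERROR" in line)
    cleaned = (line.strip() for line in errors)
    for error in cleaned:
        process_error(error)  # Still processes one line at a time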
6. dataclasses: Eliminate boilerplate code
Manually writing __init__, __repr__, __eq__, and similar methods is repetitive. The @dataclass decorator generates them automatically.
from dataclasses import dataclass, field
from typing import List
from datetime import datetime

@dataclass(order=True)
class Task:
    task_id: int
    name: str
    priority: int = 1
    status: str = "pending"
    created_at: datetime = field(default_factory=datetime.now)
    tags: List[str] = field(default_factory=list)

# Instances are created succinctly
task1 = Task(1, "Fix bug", priority=3)
task2 = Task(2, "Write docs")

dataclasses offers more than this: post‑initialisation processing, frozen (immutable) instances, and a lightweight, fully type‑annotated replacement for namedtuple. A brief sketch of the first two follows.
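A minimal sketch of post‑initialisation processing and frozen instances; the Point and Rectangle classes are illustrative examples, not from the original:

from dataclasses import dataclass, field

@dataclass(frozen=True)
class Point:
    """frozen=True makes instances immutable: assignment raises FrozenInstanceError."""
    x: float
    y: float

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # Derived field, not passed by the caller

    def __post_init__(self):
        # Runs right after the generated __init__, ideal for computed fields
        self.area = self.width * self.height

r = Rectangle(3.0, 4.0)
print(r.area)  # 12.0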
Conclusion
Reflecting on years of Python development, the key lessons are that the standard library offers powerful, elegant solutions; concise code improves maintainability; and performance gains often stem from deep language‑feature knowledge rather than complex algorithms.