
6 Hidden Python Features That Can Double Your Coding Efficiency

This article showcases six powerful yet often overlooked Python standard‑library features—pathlib, contextlib, __slots__, functools.lru_cache, generator pipelines, and dataclasses—demonstrating how they simplify code, boost performance, reduce memory usage, and make scripts more maintainable.

Data STUDIO

1. pathlib: Say goodbye to os.path's verbosity

Traditional file handling often uses os and shutil. With pathlib, paths become objects that support chainable operations, making code more concise and intuitive.

import os
import shutil
# Find and move all PDF files
for root, dirs, files in os.walk("downloads"):
    for file in files:
        if file.endswith(".pdf"):
            src = os.path.join(root, file)
            dst = os.path.join("organized", file)
            shutil.move(src, dst)

Using pathlib:

from pathlib import Path
# Concise pathlib solution
Path("organized").mkdir(exist_ok=True)  # make sure the destination exists
for pdf_file in Path("downloads").rglob("*.pdf"):
    pdf_file.rename(Path("organized") / pdf_file.name)

Typical use cases include batch renaming, existence checks, and creating nested directories:

Path("file.txt").rename("new_name.txt")
Path("data.csv").exists()
Path("a/b/c").mkdir(parents=True, exist_ok=True)
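pathlib also replaces the usual open()/read()/close() boilerplate for simple file I/O, and exposes path components as attributes. A minimal sketch (the file name is illustrative):

```python
from pathlib import Path

# write_text/read_text handle opening and closing the file for you
p = Path("notes.txt")
p.write_text("hello pathlib\n")
content = p.read_text()

# path components are attributes, not string-slicing exercises
stem, suffix = p.stem, p.suffix  # "notes", ".txt"

p.unlink()  # remove the temporary file
```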

2. contextlib: Elegant resource management

Instead of verbose try...finally blocks, contextlib.contextmanager lets you define reusable context managers.

# Traditional resource handling
db_connection = connect_to_database()
try:
    data = db_connection.query("SELECT * FROM users")
    process_data(data)
finally:
    db_connection.close()  # Easy to forget!

Custom context manager with contextlib:

from contextlib import contextmanager

@contextmanager
def managed_database(connection_string):
    """Automatically manage the lifecycle of a database connection"""
    conn = connect_to_database(connection_string)
    try:
        yield conn  # Hand over control
    finally:
        conn.close()  # Ensure closure

# Usage
with managed_database("postgresql://localhost/mydb") as db:
    results = db.query("SELECT * FROM users")
    # No need to manually close the connection

Useful scenarios: file operations, database connections, network requests, temporary files.
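Beyond @contextmanager, the module ships small helpers for common patterns. The sketch below shows contextlib.suppress (replacing a try/except-pass block) and contextlib.closing (wrapping an object that only defines close()); the LegacyResource class is a hypothetical stand-in for such an object:

```python
import os
from contextlib import suppress, closing

# suppress() swallows only the exceptions you name
with suppress(FileNotFoundError):
    os.remove("might_not_exist.tmp")  # no error even if the file is absent

# closing() adds with-statement support to objects that only define close()
class LegacyResource:
    def __init__(self):
        self.open = True

    def close(self):
        self.open = False

res = LegacyResource()
with closing(res):
    pass  # use the resource here
# close() has been called automatically on exit
```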

3. __slots__: Memory‑saving secret weapon

Creating millions of small objects can exhaust memory because each instance stores a __dict__. Declaring __slots__ fixes the attribute layout and reduces overhead.

class DataPoint:
    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.value = value

points = [DataPoint(i, i*2, i**2) for i in range(1_000_000)]  # ~200 MB

With __slots__:

class DataPoint:
    __slots__ = ('x', 'y', 'value')
    def __init__(self, x, y, value):
        self.x = x
        self.y = y
        self.value = value

points = [DataPoint(i, i*2, i**2) for i in range(1_000_000)]  # ~120 MB, 40% saved

Best suited for massive creation of small objects, performance-critical apps, and memory-constrained environments. Note: instances can only hold the attributes named in __slots__; new attributes cannot be added at runtime.
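The trade-off is easy to see directly: a slotted instance has no per-instance __dict__, and assigning an attribute outside __slots__ raises AttributeError. A minimal sketch (class name illustrative):

```python
class SlottedPoint:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = SlottedPoint(1, 2)
try:
    p.z = 3  # "z" is not in __slots__, so this raises AttributeError
    added = True
except AttributeError:
    added = False

# slotted instances drop the per-instance __dict__ entirely
has_dict = hasattr(p, "__dict__")
```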

4. functools.lru_cache: Smart caching to avoid repeated work

Repeatedly calling a function with the same arguments leads to unnecessary work. functools.lru_cache memoizes results.

def get_user_data(user_id):
    # Each call queries the database
    return query_database(f"SELECT * FROM users WHERE id = {user_id}")

for _ in range(100):
    data = get_user_data(123)  # 100 database queries!

Optimized with lru_cache:

from functools import lru_cache

@lru_cache(maxsize=128)  # Cache the most recent 128 distinct calls
def get_user_data(user_id):
    print(f"Querying database: user_{user_id}")
    return query_database(f"SELECT * FROM users WHERE id = {user_id}")

data1 = get_user_data(123)  # First call prints query
data2 = get_user_data(123)  # Cached, no print
data3 = get_user_data(123)  # Cached again

Applicable to API calls, heavy computations, configuration parsing, and database query caching. Note that all arguments must be hashable, since they form the cache key.
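The decorator also attaches introspection helpers to the wrapped function: cache_info() reports hits and misses, and cache_clear() resets the cache (useful when the underlying data changes). A small self-contained sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded cache
def square(n):
    return n * n

for _ in range(5):
    square(7)  # one miss, then four hits

info = square.cache_info()  # CacheInfo(hits=4, misses=1, ...)
square.cache_clear()        # empty the cache, e.g. after data changes
```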

5. Generator pipelines: Memory‑friendly large‑data processing

Reading an entire multi‑gigabyte log file into memory can crash the program. Using generators enables lazy, line‑by‑line processing.

# Naïve approach – loads all lines into memory
def process_log_file(filename):
    with open(filename) as f:
        lines = f.readlines()
    results = []
    for line in lines:
        if "ERROR" in line:
            results.append(line.strip())
    return results

Generator‑based pipeline:

def read_lines(filename):
    """Yield lines one by one"""
    with open(filename) as f:
        for line in f:
            yield line

def filter_errors(lines):
    """Yield only error lines"""
    for line in lines:
        if "ERROR" in line:
            yield line

def clean_logs(lines):
    """Strip whitespace from each line"""
    for line in lines:
        yield line.strip()

log_file = "app.log"
for error in clean_logs(filter_errors(read_lines(log_file))):
    process_error(error)  # Memory usage stays constant

Advantages: high memory efficiency, strong composability, and immediate processing without waiting for the whole dataset.
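For short pipelines, the same stages can be written as chained generator expressions instead of named generator functions; evaluation is still lazy, line by line. A sketch with a small in-memory sample:

```python
def error_lines(lines):
    # the three-stage pipeline as chained generator expressions
    stripped = (line.strip() for line in lines)
    return (line for line in stripped if "ERROR" in line)

sample = ["INFO ok\n", "ERROR disk full\n", "ERROR timeout\n"]
errors = list(error_lines(sample))  # -> ["ERROR disk full", "ERROR timeout"]
```

Named generator functions scale better when stages need docstrings or reuse; generator expressions keep one-off pipelines compact.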

6. dataclasses: Eliminate boilerplate code

Manually writing __init__, __repr__, __eq__, and similar methods is repetitive. The @dataclass decorator generates them automatically.

from dataclasses import dataclass, field
from typing import List
from datetime import datetime

@dataclass(order=True)
class Task:
    task_id: int
    name: str
    priority: int = 1
    status: str = "pending"
    created_at: datetime = field(default_factory=datetime.now)
    tags: List[str] = field(default_factory=list)

# Instances are created succinctly
task1 = Task(1, "Fix bug", priority=3)
task2 = Task(2, "Write docs")

Other dataclass features worth knowing: post-initialisation processing via __post_init__, frozen (immutable) instances, and use as a lightweight, fully type-annotated replacement for namedtuple.
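The two features mentioned above can be sketched briefly: frozen=True makes instances immutable (assignment raises FrozenInstanceError), and __post_init__ lets you compute derived fields after the generated __init__ runs. Class names here are illustrative:

```python
from dataclasses import dataclass, field, FrozenInstanceError

@dataclass(frozen=True)
class Point:
    x: float
    y: float

p = Point(1.0, 2.0)
try:
    p.x = 5.0  # frozen instances reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)  # derived, not passed to __init__

    def __post_init__(self):
        # runs right after the generated __init__
        self.area = self.width * self.height

r = Rectangle(3.0, 4.0)
```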

Conclusion

Reflecting on years of Python development, the key lessons are that the standard library offers powerful, elegant solutions; concise code improves maintainability; and performance gains often stem from deep language‑feature knowledge rather than complex algorithms.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by Data STUDIO
