
10 Hidden Python Security Pitfalls Every Developer Must Avoid

This article covers ten little-known Python security pitfalls, from optimized-away asserts and directory permission quirks to URL query parsing differences and Unicode normalization issues. It explains each flaw, its impact, and how to mitigate it.

Python Crawling & Data Mining

Jul 31, 2022

Author: Dennis Brinkrolf | Translator: 豌豆花下猫@Python猫 (CC BY‑NC‑SA 4.0)

1. Optimized‑away assert statements

When Python runs with optimization enabled (the -O flag or the PYTHONOPTIMIZE environment variable), all assert statements are stripped out. If developers rely on assert for security checks (e.g., authentication), the check disappears and unauthorized actions become possible.

def superuser_action(request, user):
    assert user.is_super_user
    # execute action as super user
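
One possible mitigation is an explicit check that raises an exception, since ordinary control flow is not removed by the optimizer. The sketch below is illustrative only: the User dataclass is a hypothetical stand-in for a framework's user model, and PermissionError stands in for whatever error response the application would return.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical stand-in for the framework's user model.
    name: str
    is_super_user: bool = False


def superuser_action(user: User) -> str:
    # An explicit check is ordinary control flow, so it survives `python -O`.
    if not user.is_super_user:
        raise PermissionError(f"{user.name} is not a super user")
    return f"action executed for {user.name}"
```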

2. os.makedirs permission quirks

The mode argument of os.makedirs sets the permissions of newly created directories. Up to Python 3.6, every directory created by the call gets the given mode (e.g., 0o700). From Python 3.7 onward, only the deepest directory gets the specified mode; intermediate directories receive the default (0o777 masked by the umask, typically 0o755). This version difference caused a permission-escalation bug in Django (CVE-2020-24583) and a similar issue in WordPress.

def init_directories(request):
    os.makedirs("A/B/C", mode=0o700)
    return HttpResponse("Done!")
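
If every level of the tree must be private, one option is to create the directories one at a time and set the mode explicitly. The helper below is a minimal sketch under POSIX assumptions; makedirs_private is a hypothetical name, not a standard-library function.

```python
import os


def makedirs_private(path: str, mode: int = 0o700) -> None:
    """Create every missing directory in `path` with the same mode."""
    current = os.sep if os.path.isabs(path) else ""
    for part in os.path.normpath(path).split(os.sep):
        if not part:
            continue
        current = os.path.join(current, part)
        if not os.path.isdir(current):
            os.mkdir(current, mode)
            # mkdir applies the umask, so set the bits explicitly afterwards.
            os.chmod(current, mode)
```

Calling makedirs_private("A/B/C") would then leave A, A/B and A/B/C with the same permission bits, independent of the interpreter version.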

3. Absolute‑path hijacking with os.path.join

If any component passed to os.path.join starts with /, all preceding components are discarded, producing an absolute path. Attackers can bypass path‑traversal checks that only look for . characters.

def read_file(request):
    filename = request.POST['filename']
    file_path = os.path.join("var", "lib", filename)
    if file_path.find(".") != -1:
        return HttpResponse("Failed!")
    with open(file_path) as f:
        return HttpResponse(f.read(), content_type='text/plain')
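
A more robust check resolves the final path and verifies that it still lies inside the intended base directory. The following is a sketch of that idea; safe_join is a hypothetical helper name, and posixpath is used only to show the join behaviour with POSIX rules regardless of platform.

```python
import os
import posixpath


def safe_join(base: str, user_path: str) -> str:
    """Join `user_path` onto `base` and refuse results outside `base`."""
    base_real = os.path.realpath(base)
    candidate = os.path.realpath(os.path.join(base_real, user_path))
    if os.path.commonpath([base_real, candidate]) != base_real:
        raise ValueError(f"path escapes base directory: {user_path!r}")
    return candidate


# An absolute component discards everything before it:
hijacked = posixpath.join("var", "lib", "/etc/passwd")
```

Here hijacked is "/etc/passwd", which contains no dot and would pass the check in the snippet above.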

4. Uncontrolled temporary files

tempfile.NamedTemporaryFile accepts prefix and suffix arguments that can be manipulated for path-traversal attacks. An attacker can create temporary files in arbitrary locations.

def touch_tmp_file(request):
    id = request.GET['id']
    tmp_file = tempfile.NamedTemporaryFile(prefix=id)
    return HttpResponse(f"tmp file: {tmp_file} created!", content_type='text/plain')
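
One way to reduce the risk is to validate the prefix against a strict allow-list before passing it on. The sketch below assumes that request ids are short alphanumeric tokens; the pattern and the create_tmp_file name are illustrative choices, not part of the original article.

```python
import re
import tempfile

# Assumption: request ids are short alphanumeric tokens.
_SAFE_PREFIX = re.compile(r"[A-Za-z0-9_-]{1,32}")


def create_tmp_file(prefix: str) -> str:
    """Create a temporary file and return its path; reject unsafe prefixes."""
    if not _SAFE_PREFIX.fullmatch(prefix):
        raise ValueError(f"invalid prefix: {prefix!r}")
    # delete=False keeps the file after close so the caller can use the path.
    with tempfile.NamedTemporaryFile(prefix=prefix, delete=False) as tmp:
        return tmp.name
```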

5. Extended Zip Slip

Extracting zip archives with ZipFile without sanitising entry names can lead to arbitrary file writes. Only ZipFile.extract and ZipFile.extractall sanitise entry names; other approaches, such as ZipFile.read followed by a manual open, do not.

def extract_html(request):
    filename = request.FILES['filename']
    zf = zipfile.ZipFile(filename.temporary_file_path(), "r")
    for entry in zf.namelist():
        if entry.endswith('.html'):
            file_content = zf.read(entry)
            with open(entry, "wb") as fp:
                fp.write(file_content)
    zf.close()
    return HttpResponse("HTML files extracted!")
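
When entries have to be written manually, each target path can be resolved and checked against the destination directory first. The function below is a minimal sketch of that check; the dest_dir parameter and the decision to silently skip unsafe entries are assumptions made for illustration.

```python
import os
import zipfile


def extract_html(archive, dest_dir: str) -> list[str]:
    """Write the .html entries of `archive` below `dest_dir` only."""
    dest_real = os.path.realpath(dest_dir)
    written = []
    with zipfile.ZipFile(archive) as zf:
        for entry in zf.namelist():
            if not entry.endswith(".html"):
                continue
            target = os.path.realpath(os.path.join(dest_real, entry))
            # Skip entries such as "../x.html" or "/tmp/x.html".
            if os.path.commonpath([dest_real, target]) != dest_real:
                continue
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as fp:
                fp.write(zf.read(entry))
            written.append(target)
    return written
```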

6. Incomplete regex matching

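Using re.match to detect malicious patterns can be bypassed: re.match only matches at the start of the string, and . does not match a newline by default, so a payload placed after a newline goes undetected. re.search, by contrast, scans the whole string. This makes simple blacklist regexes ineffective.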

def is_sql_injection(request):
    pattern = re.compile(r".*(union)|(select).*")
    name_to_test = request.GET['name']
    if re.match(pattern, name_to_test):
        return True
    return False
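
The difference between the two functions can be seen with a payload that hides the keywords behind a newline. This is a small demonstration using the same pattern; the payload string is an illustrative example.

```python
import re

# Same blacklist pattern as in the snippet above.
pattern = re.compile(r".*(union)|(select).*")
payload = "aaaaaa \n union select"

# re.match anchors at position 0 and "." stops at the newline: no hit.
missed = re.match(pattern, payload) is None

# re.search tries every position, so the text after the newline is found.
caught = re.search(pattern, payload) is not None
```

Both missed and caught are True here. Even with re.search, keyword blacklists remain easy to bypass, so parameterized queries are the more reliable defence against SQL injection.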

7. Unicode normalisation bypass

Normalising input with unicodedata.normalize('NFKC', ...) after escaping can turn compatibility characters such as U+FE64 (small less-than sign) into ASCII < and >, recreating HTML tags that the earlier escaping never saw and leading to XSS.

def render_input(request):
    user_input = escape(request.GET['p'])
    normalized_user_input = unicodedata.normalize('NFKC', user_input)
    context = {'my_input': normalized_user_input}
    return render(request, 'test.html', context)
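
The effect of the ordering can be reproduced with the standard library alone. In this sketch html.escape stands in for the framework's escape function, which is an assumption made to keep the example self-contained.

```python
import html
import unicodedata

# U+FE64 and U+FE65 are the "small" less-than and greater-than signs.
raw = "\ufe64script\ufe65alert(1)\ufe64/script\ufe65"

# Vulnerable order: escaping sees no "<" or ">", then NFKC creates them.
escaped_first = unicodedata.normalize("NFKC", html.escape(raw))

# Safer order: normalise first, escape as the very last step.
normalized_first = html.escape(unicodedata.normalize("NFKC", raw))
```

With the vulnerable order, escaped_first contains a live script tag; with the safer order, the angle brackets end up as &lt; and &gt; entities.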

8. Unicode code‑point collisions

Different Unicode characters can collapse into the same character after case conversion. For example, the Turkish dotless ı (U+0131) upper-cases to a plain ASCII I. An attacker can register an email address containing ı, so that the upper-cased lookup matches the victim's stored address while the attacker-supplied address differs. Because the reset email is sent to the supplied address rather than the stored one, it goes to the attacker.

def reset_pw(request):
    email = request.GET['email']
    result = User.objects.filter(email__exact=email.upper()).first()
    if not result:
        return HttpResponse("User not found!")
    send_mail('Reset Password', 'Your new pw: 123456.', '[email protected]', [email], fail_silently=False)
    return HttpResponse("Password reset email sent!")
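
A simple safeguard is to send the mail to the address stored in the database, never to the string taken from the request. The sketch below uses a hypothetical in-memory dictionary in place of the user table to show both the collision and that safeguard.

```python
# Hypothetical in-memory user table keyed by upper-cased email address.
USERS = {"MIKE@EXAMPLE.ORG": "mike@example.org"}


def reset_recipient(supplied_email: str):
    """Return the address a reset mail may go to, or None if unknown."""
    stored = USERS.get(supplied_email.upper())
    # Always mail the stored address, never the attacker-supplied string.
    return stored


# "ı" (U+0131, dotless i) upper-cases to a plain ASCII "I".
attacker_input = "m\u0131ke@example.org"
collides = attacker_input.upper() == "MIKE@EXAMPLE.ORG"
```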

9. IP address normalisation

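In older Python versions, ipaddress.IPv4Address normalises addresses by stripping leading zeros. An attacker can supply 127.0.00.1, which does not appear in the blacklist but normalises to 127.0.0.1 afterwards, bypassing the check and enabling SSRF. Python 3.9.5 and later reject octets with leading zeros instead.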

def send_request(request):
    ip = request.GET['ip']
    try:
        if ip in ["127.0.0.1", "0.0.0.0"]:
            return HttpResponse("Not allowed!")
        ip = str(ipaddress.IPv4Address(ip))
    except ipaddress.AddressValueError:
        return HttpResponse("Error at validation!")
    requests.get('https://' + ip)
    return HttpResponse("Request sent!")
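
Checking the parsed address object rather than the raw string avoids the ordering problem on any Python version. The function below is a sketch of that approach; is_allowed_target is a hypothetical name, and the set of rejected ranges is an assumption that a real deployment would need to review.

```python
import ipaddress


def is_allowed_target(raw_ip: str) -> bool:
    """Return True for well-formed IPv4 addresses outside internal ranges."""
    try:
        addr = ipaddress.IPv4Address(raw_ip)
    except ipaddress.AddressValueError:
        return False
    # Check the parsed object, not the raw string, so that every spelling
    # of an address is judged after normalisation.
    return not (
        addr.is_loopback
        or addr.is_unspecified
        or addr.is_private
        or addr.is_link_local
    )
```

An input such as 127.0.00.1 is then refused either way: newer interpreters reject it as malformed, and older ones normalise it to a loopback address before the check runs.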

10. URL query‑parameter parsing differences

Before the fix for CVE-2021-23336 (shipped in Python 3.9.2 and backported to 3.6.13, 3.7.10 and 3.8.8), urllib.parse.parse_qsl treated both ; and & as separators. When a front-end (e.g., PHP) does not recognise ; as a separator, it treats the affected part of the query string as a single parameter and forwards it to a Python back-end, which then splits it into separate parameters. This can cause request-parameter injection vulnerabilities (e.g., Django cache-poisoning CVE-2021-23336).
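
On interpreters that support it, passing the separator explicitly makes the parsing rule visible in the code. This is a small sketch that assumes Python 3.9.2 or newer, where the separator parameter of parse_qsl is available; the query string is an illustrative example.

```python
from urllib.parse import parse_qsl

query = "a=1;b=2&c=3"

# Assumption: Python 3.9.2 or newer, where `separator` is available.
# With "&" as the only separator, ";" stays part of the value of "a".
pairs = parse_qsl(query, separator="&")
```

The result contains two parameters, a and c, which matches what a front-end that splits only on & would see.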

In this article we presented ten subtle Python security pitfalls that are easy to overlook but have caused real‑world vulnerabilities. Developers should upgrade libraries, read documentation carefully, and audit code for these patterns.
