Fuzzy String Matching in Python with difflib, fuzzywuzzy, and TheFuzz

This article shows how to perform fuzzy string matching in Python with three libraries: difflib, fuzzywuzzy (with python-Levenshtein), and TheFuzz. For each library it gives the required installation command, defines a matching function, and runs an example that finds the best match for a given pattern among a list of candidate strings.

Test Development Learning Exchange

difflib

import difflib

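def fuzzy_match_difflib(pattern, sequence):
    """Return the string in `sequence` most similar to `pattern`, or None."""
    best_match = None
    best_ratio = 0
    for s in sequence:
        # ratio() returns a float in [0, 1]; 1.0 means the strings are identical
        ratio = difflib.SequenceMatcher(None, pattern, s).ratio()
        # Strict '>' keeps the first candidate when several tie for the top score
        if ratio > best_ratio:
            best_ratio = ratio
            best_match = s
    return best_match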

# Example
patterns = ["hello"]
sequences = ["helllo", "world", "python", "hell", "wrld"]
for pattern in patterns:
    match = fuzzy_match_difflib(pattern, sequences)
    print(f"Best match for '{pattern}': {match}")
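
The standard library also offers a ready-made helper for this task, difflib.get_close_matches, which ranks candidates by the same SequenceMatcher ratio and drops any that score below a cutoff. The sketch below is an alternative to the hand-written loop, not part of the original example; the values chosen for n and cutoff are only illustrative.

```python
import difflib

sequences = ["helllo", "world", "python", "hell", "wrld"]

# n limits how many matches are returned; cutoff (0 to 1) filters out weak matches.
top = difflib.get_close_matches("hello", sequences, n=1, cutoff=0.6)
best = top[0] if top else None  # the list is empty when nothing reaches the cutoff
print(best)  # helllo

# Asking for more results returns them sorted from most to least similar.
ranked = difflib.get_close_matches("hello", sequences, n=3, cutoff=0.6)
print(ranked)  # ['helllo', 'hell']
```

Unlike the loop above, this helper silently discards low-scoring candidates, so lower the cutoff if you always want some result back.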

fuzzywuzzy

pip install fuzzywuzzy python-Levenshtein

from fuzzywuzzy import fuzz

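def fuzzy_match_fuzzywuzzy(pattern, sequence):
    """Return the string in `sequence` most similar to `pattern`, or None."""
    best_match = None
    best_ratio = 0
    for s in sequence:
        # fuzz.ratio() returns an integer in [0, 100]; 100 means identical
        ratio = fuzz.ratio(pattern, s)
        # Strict '>' keeps the first candidate when several tie for the top score
        if ratio > best_ratio:
            best_ratio = ratio
            best_match = s
    return best_match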

# Example
patterns = ["hello"]
sequences = ["helllo", "world", "python", "hell", "wrld"]
for pattern in patterns:
    match = fuzzy_match_fuzzywuzzy(pattern, sequences)
    print(f"Best match for '{pattern}': {match}")

TheFuzz

pip install thefuzz

from thefuzz import fuzz

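def fuzzy_match_thefuzz(pattern, sequence):
    """Return the string in `sequence` most similar to `pattern`, or None."""
    best_match = None
    best_ratio = 0
    for s in sequence:
        # fuzz.ratio() returns an integer in [0, 100]; 100 means identical
        ratio = fuzz.ratio(pattern, s)
        # Strict '>' keeps the first candidate when several tie for the top score
        if ratio > best_ratio:
            best_ratio = ratio
            best_match = s
    return best_match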

# Example
patterns = ["hello"]
sequences = ["helllo", "world", "python", "hell", "wrld"]
for pattern in patterns:
    match = fuzzy_match_thefuzz(pattern, sequences)
    print(f"Best match for '{pattern}': {match}")
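
TheFuzz also ships a process module whose extractOne function does the "loop and keep the best" step for you and returns the score together with the match. The following is a minimal sketch, not the article's own method: best_match_with_score is a hypothetical helper name, extractOne is used with its default scorer (fuzz.WRatio rather than fuzz.ratio), and the difflib fallback is added only so the snippet still runs where thefuzz is not installed.

```python
import difflib

try:
    # Third-party package: pip install thefuzz
    from thefuzz import process
except ImportError:
    process = None  # use the difflib fallback below


def best_match_with_score(pattern, candidates):
    """Return (best_candidate, score on a 0-100 scale); candidates must be non-empty."""
    if process is not None:
        # For a list of choices, extractOne returns a (choice, score) tuple.
        match, score = process.extractOne(pattern, candidates)
        return match, score
    # Fallback (an assumption for illustration): scale difflib's 0-1 ratio to 0-100.
    scored = [
        (c, round(100 * difflib.SequenceMatcher(None, pattern, c).ratio()))
        for c in candidates
    ]
    return max(scored, key=lambda pair: pair[1])


sequences = ["helllo", "world", "python", "hell", "wrld"]
match, score = best_match_with_score("hello", sequences)
print(f"Best match for 'hello': {match} (score {score})")
```

Returning the score alongside the match lets the caller reject weak matches instead of always accepting the top candidate.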

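Each of the three examples defines a function that iterates over a list of candidate strings, computes a similarity score with the chosen library, and returns the highest-scoring string as the best match. The scores use different scales: difflib's SequenceMatcher.ratio() returns a float between 0 and 1, while fuzz.ratio() returns an integer between 0 and 100. difflib is part of the standard library and needs no installation, whereas fuzzywuzzy and TheFuzz must be installed with pip; TheFuzz is the renamed successor of fuzzywuzzy and exposes the same fuzz.ratio interface. Choose among them based on your performance and dependency preferences.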
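
Because the three functions differ only in how a pair of strings is scored, the shared pattern can be sketched once with the scorer passed in as a parameter. This is an illustrative generalization rather than code from the examples above: best_match, difflib_score, and min_score are hypothetical names, and the default scorer rescales difflib's ratio to 0-100 so that fuzz.ratio from fuzzywuzzy or TheFuzz could be passed in its place.

```python
import difflib


def difflib_score(a, b):
    """Similarity of two strings on a 0-100 scale, based on difflib."""
    return 100.0 * difflib.SequenceMatcher(None, a, b).ratio()


def best_match(pattern, candidates, scorer=difflib_score, min_score=0.0):
    """Return (candidate, score) for the highest-scoring candidate.

    Returns None when there are no candidates or the best score is
    below min_score. On ties, the earliest candidate wins.
    """
    scored = [(c, scorer(pattern, c)) for c in candidates]
    if not scored:
        return None
    match, score = max(scored, key=lambda pair: pair[1])
    return (match, score) if score >= min_score else None


sequences = ["helllo", "world", "python", "hell", "wrld"]
result = best_match("hello", sequences, min_score=60)
print(result[0])  # helllo
```

To switch libraries, pass a different scorer, for example best_match("hello", sequences, scorer=fuzz.ratio) after importing fuzz from thefuzz.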
