FuzzyBunny

A high-performance, lightweight Python library for fuzzy string matching and ranking, implemented in C++ with Pybind11.

Features

Blazing Fast: C++ core that runs 2-5x faster than pure-Python alternatives.
Multiple Scorers: Support for Levenshtein, Jaccard, and Token Sort ratios.
Partial Matching: Find the best substring matches.
Hybrid Scoring: Combine multiple scorers with custom weights.
Pandas & NumPy Integration: Native support for Series and Arrays.
Batch Processing: Parallelized matching for large datasets using OpenMP.
Unicode Support: Handles international characters and normalization.
Benchmarking Tools: Built-in utilities to measure performance.
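
To make the scorer names above concrete, here is a minimal pure-Python sketch of a token-based Jaccard similarity and the normalization step behind a token sort comparison. It is not FuzzyBunny's C++ implementation; the function names and the lowercase, whitespace-split tokenization are assumptions made for illustration.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase whitespace tokens (assumed tokenization)."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty inputs are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def token_sort_key(s: str) -> str:
    """Sort tokens alphabetically so that word order no longer matters."""
    return " ".join(sorted(s.lower().split()))


print(jaccard("apple pie", "pie apple"))   # 1.0, same token set
print(token_sort_key("banana apple"))      # apple banana
```

A token sort ratio would then compare the two sorted strings with a character-level scorer such as Levenshtein.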

Installation

pip install fuzzybunny

Quick Start

import fuzzybunny

# Basic matching
score = fuzzybunny.levenshtein("kitten", "sitting")
print(f"Similarity: {score:.2f}")

# Ranking candidates
candidates = ["apple", "apricot", "banana", "cherry"]
results = fuzzybunny.rank("app", candidates, top_n=2)
# Illustrative output: [('apple', 0.6), ('apricot', 0.42)]
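
For readers who want to see what a Levenshtein-based similarity computes, the following is a minimal pure-Python sketch. It assumes the score is normalized as 1 - distance / max(len(a), len(b)); FuzzyBunny's actual normalization is not stated in this README, so its scores may differ. Both function names are hypothetical.

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


print(levenshtein_distance("kitten", "sitting"))             # 3
print(f"{levenshtein_similarity('kitten', 'sitting'):.2f}")  # 0.57
```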

Advanced Usage

Hybrid Scorer

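Combine several scorers with custom weights. In this example token_sort carries most of the weight, so a candidate with the same words in a different order still ranks highly: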

# Weighted blend: 30% Levenshtein, 70% token sort
results = fuzzybunny.rank(
    "apple banana",
    ["banana apple"],
    scorer="hybrid",
    weights={"levenshtein": 0.3, "token_sort": 0.7},
)
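
The weighting idea can be sketched in plain Python as a normalized weighted average of individual scores. This is an illustration under assumptions, not the library's code: `difflib.SequenceMatcher` from the standard library stands in for a Levenshtein ratio, and `hybrid_score` and `SCORERS` are hypothetical names.

```python
from difflib import SequenceMatcher


def char_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for a Levenshtein ratio."""
    return SequenceMatcher(None, a, b).ratio()


def token_sort_ratio(a: str, b: str) -> float:
    """Compare the strings after sorting their tokens, so word order is ignored."""
    norm_a = " ".join(sorted(a.lower().split()))
    norm_b = " ".join(sorted(b.lower().split()))
    return char_ratio(norm_a, norm_b)


# Hypothetical registry mapping scorer names to functions.
SCORERS = {"levenshtein": char_ratio, "token_sort": token_sort_ratio}


def hybrid_score(a: str, b: str, weights: dict) -> float:
    """Weighted average of the named scorers, normalized by the total weight."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * SCORERS[name](a, b) for name, w in weights.items()) / total


score = hybrid_score(
    "apple banana", "banana apple",
    {"levenshtein": 0.3, "token_sort": 0.7},
)
print(f"{score:.2f}")
```

Because the token sort score is 1.0 for reordered words, the blended score here cannot fall below 0.7 regardless of the character-level score.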

Pandas Integration

Use the fuzzy accessor on a Series to match directly against a column:

import pandas as pd
import fuzzybunny

df = pd.DataFrame({"names": ["apple pie", "banana bread", "cherry tart"]})
results = df["names"].fuzzy.match("apple", mode="partial")
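
What partial mode might compute can be sketched without pandas: slide a window the length of the query across each candidate and keep the best score. This is an assumed interpretation of "best substring match", with `difflib.SequenceMatcher` as a stand-in scorer and `partial_ratio` as a hypothetical name; the library's own algorithm may differ.

```python
from difflib import SequenceMatcher


def partial_ratio(query: str, text: str) -> float:
    """Best similarity between query and any window of text of len(query)."""
    query, text = query.lower(), text.lower()
    if not query or not text:
        return 0.0
    if len(query) >= len(text):
        return SequenceMatcher(None, query, text).ratio()
    width = len(query)
    return max(
        SequenceMatcher(None, query, text[i:i + width]).ratio()
        for i in range(len(text) - width + 1)
    )


names = ["apple pie", "banana bread", "cherry tart"]
scores = {name: partial_ratio("apple", name) for name in names}
best = max(scores, key=scores.get)
print(best, scores[best])  # apple pie 1.0
```

The same function could be applied to a column with `Series.map` when the accessor is not available.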

Benchmarking

Compare scorer performance on your own data:

import fuzzybunny
candidates = ["apple", "apricot", "banana", "cherry"]
perf = fuzzybunny.benchmark("query", candidates)
print(f"Levenshtein mean time: {perf['levenshtein']['mean']:.6f}s")
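
To cross-check such numbers against another scorer, a small standard-library timing harness is enough. The sketch below is independent of FuzzyBunny: `time_scorer` is a hypothetical helper, and its result dictionary only loosely mirrors the per-scorer 'mean' field used in the example above.

```python
import statistics
import time
from difflib import SequenceMatcher


def time_scorer(scorer, query, candidates, repeats=5):
    """Time scoring the query against every candidate, over several passes."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        for candidate in candidates:
            scorer(query, candidate)
        timings.append(time.perf_counter() - start)
    return {
        "mean": statistics.mean(timings),
        "min": min(timings),
        "max": max(timings),
    }


def difflib_ratio(a, b):
    """Pure-Python baseline scorer from the standard library."""
    return SequenceMatcher(None, a, b).ratio()


candidates = ["apple", "apricot", "banana", "cherry"]
perf = time_scorer(difflib_ratio, "query", candidates)
print(f"difflib mean time per pass: {perf['mean']:.6f}s")
```

Timings vary by machine and input length, so compare scorers on the same data and in the same process.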

License

MIT
