DataFog

Open-source PII detection for AI agents. Scan, redact, and guard sensitive data — locally, in milliseconds.

from datafog import sanitize

sanitize("Call Sarah Chen at 415-555-0142, SSN 234-56-7890")
# → "Call [PERSON_1] at [PHONE_1], SSN [SSN_1]"

Projects

🔒 datafog-python — The core SDK

PII detection and redaction via regex + NLP cascade. One function call. <2MB core install. 190x faster than spaCy for structured PII.

pip install datafog
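As a rough illustration of what the regex stage of such a cascade might look like, here is a minimal standard-library sketch. It is not DataFog's implementation: the patterns, the `redact_structured` name, and the returned mapping are all assumptions made for this example, and names such as "Sarah Chen" would need the NLP stage, which is not shown.

```python
import re

# Illustrative patterns only; these are assumptions, not DataFog's actual rules.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def redact_structured(text: str) -> tuple[str, dict[str, str]]:
    """Replace regex-detectable PII with numbered placeholders.

    Returns the redacted text and a placeholder -> original value mapping.
    Hypothetical helper, not part of the datafog package.
    """
    mapping: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        seen: dict[str, str] = {}  # original value -> placeholder, per label

        def substitute(match: re.Match) -> str:
            value = match.group(0)
            if value not in seen:
                # Number placeholders in order of first appearance.
                seen[value] = f"[{label}_{len(seen) + 1}]"
                mapping[seen[value]] = value
            return seen[value]

        text = pattern.sub(substitute, text)
    return text, mapping


redacted, found = redact_structured("Call 415-555-0142, SSN 234-56-7890")
```

Repeated values reuse the same placeholder, so the redacted text keeps its referential structure while the mapping stays local to the caller.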

🔌 datafog-mcp — MCP privacy proxy (coming soon)

Add PII detection to any MCP server with one config change. Wraps Postgres, filesystem, Slack, and other MCP servers — intercepts tool responses before PII enters the agent's context window.

uvx datafog-mcp proxy --wrap <your-mcp-server>
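Since datafog-mcp is not yet released, the following is only a conceptual sketch of the interception idea: wrap a tool so that every string in its response is scrubbed before the agent sees it. Everything here is hypothetical, including `wrap_tool`, `scrub`, the single-regex `redact` stand-in, and the `query_customers` tool; none of it reflects the proxy's real interface.

```python
import re
from typing import Any, Callable

# Stand-in detector: a single SSN regex, purely for illustration.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def redact(text: str) -> str:
    return SSN.sub("[SSN]", text)


def scrub(value: Any) -> Any:
    """Recursively redact every string inside a JSON-like tool response."""
    if isinstance(value, str):
        return redact(value)
    if isinstance(value, list):
        return [scrub(item) for item in value]
    if isinstance(value, dict):
        return {key: scrub(item) for key, item in value.items()}
    return value  # numbers, booleans, None pass through unchanged


def wrap_tool(tool: Callable[..., Any]) -> Callable[..., Any]:
    """Return a tool whose responses are scrubbed before the agent sees them."""
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        return scrub(tool(*args, **kwargs))
    return wrapped


# Hypothetical tool standing in for a call to a wrapped MCP server.
def query_customers(limit: int) -> dict:
    rows = [{"id": 1, "note": "SSN 234-56-7890 on file"}]
    return {"rows": rows[:limit]}


safe_query = wrap_tool(query_customers)
```

Because the wrapper sits between the tool and the caller, the tool itself needs no changes, which mirrors the "one config change" goal described above.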

🧪 datafog-core — Rust engine (in development)

High-performance detection core in Rust. Will power both the Python SDK (via PyO3) and native integrations.

Use cases

Agent guardrails — Wrap LLM calls with scan_prompt() / filter_output() to catch PII before it enters or leaves your agent.

MCP privacy layer — Proxy any MCP server so tool responses are automatically scanned. Your agent reasons over [PERSON_1] instead of real names.

CI/CD scanning — datafog scan ./data catches PII in test fixtures, logs, and configs before they ship.

RAG sanitization — Scrub retrieved chunks before injecting into prompts.
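The guardrail and RAG patterns above share one shape: redact text at each boundary it crosses. The sketch below shows that shape with a stand-in email regex. The `redact`, `build_prompt`, and `guarded_call` helpers are hypothetical names for this example and are not the signatures of scan_prompt() or filter_output(), which are not documented here.

```python
import re
from typing import Callable

# Stand-in detector for illustration; a real pipeline would call a PII library here.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text: str) -> str:
    return EMAIL.sub("[EMAIL]", text)


def build_prompt(question: str, chunks: list[str]) -> str:
    """RAG sanitization: scrub each retrieved chunk, then assemble the prompt."""
    context = "\n".join(redact(chunk) for chunk in chunks)
    return f"Context:\n{context}\n\nQuestion: {question}"


def guarded_call(llm: Callable[[str], str], prompt: str) -> str:
    """Guardrail: redact the prompt on the way in and the output on the way out."""
    return redact(llm(redact(prompt)))


# Hypothetical model stub that echoes the prompt and leaks an address.
def fake_llm(prompt: str) -> str:
    return prompt + " reply bob@x.org"


prompt = build_prompt("Who?", ["Contact sarah@example.com today"])
answer = guarded_call(fake_llm, "mail amy@y.io")
```

Passing the model in as a plain callable keeps the guard independent of any particular LLM client.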

Links

🌐 datafog.ai · 📦 PyPI · 💬 Discord · 𝕏 @datafoginc

Popular repositories

  1. datafog-python Public

    Python SDK for PII detection and redaction in text and images, combining regex + NLP pipelines for production privacy workflows.

    Python 42 11

  2. datafog-instructor Public

    Python 14 1

  3. codexify Public archive

    An open-source API that identifies, masks, and replaces Personally Identifying Information (PII)

    Python 10 1

  4. vlm-api Public

    REST API for computing cross-modal similarity between images and text using the ColPali vision-language model

    Python 7 1

  5. datafog-ollama-demo Public

    Streamlit web demo using datafog-instructor and Ollama

    Python 3 2

  6. datafog-api Public

    Privacy Engineering for the Generative AI era made available through a REST API

    Python 2

