Real-time safety for knowledge agents. Groundedness verification, privacy redaction, and context compression — without replacing your stack.
PyPI · Quickstart · Integrations · Examples · Website
pip install latencefrom latence import Latence
trace = Latence(api_key="lat_...")
score = trace.grounding.rag(
query="Can I promise this customer a refund?",
response_text="Yes, the refund will arrive within 48 hours.",
raw_context="Refunds require manual finance approval before timelines are promised.",
)
print(score.risk_band)
print(score.runtime_decision)- Groundedness — verify RAG answers against retrieved context.
- Compression — reduce long context to what matters.
- Privacy — redact private data before it spreads through tools, logs, or prompts.
export LATENCE_TRACE_API_KEY="lat_..."from latence import Latence
trace = Latence()The SDK connects to https://api.latence.ai by default. Override with base_url or LATENCE_TRACE_URL.
# Groundedness scoring
score = trace.grounding.rag(query=..., response_text=..., raw_context=...)
# Privacy redaction
redacted = trace.privacy.redact(text=...)
# Context compression
compressed = trace.compression.text(text=..., compression_rate=0.4)Latence is synchronous. AsyncLatence exposes the same surface for asyncio services.
Base dependencies are only httpx and pydantic.
Direct calls are the recommended path. Optional helpers live under
latence.integrations.
pip install "latence[langchain]"
pip install "latence[llama_index]"
pip install "latence[openai]"
pip install "latence[haystack]"Docs: Integrations