AI memory layer powered by GrafeoDB, an embedded graph database with native vector search.
No servers, no Docker, no Neo4j, no Qdrant. One .db file + one LLM.
- Typical memory stack: containers with Neo4j + Qdrant, plus an embedding API and an LLM
- grafeo-memory stack: grafeo (a single file) + an LLM
```
uv add grafeo-memory              # base (bring your own LLM + embedder)
uv add grafeo-memory[mistral]     # + Mistral embeddings
uv add grafeo-memory[openai]      # + OpenAI embeddings
uv add grafeo-memory[anthropic]   # + Anthropic embeddings
uv add grafeo-memory[mcp]         # + MCP server for AI agents
uv add grafeo-memory[all]         # all providers
```

Or with pip:
```
pip install grafeo-memory[openai]
```

```python
from openai import OpenAI
from grafeo_memory import MemoryManager, MemoryConfig, OpenAIEmbedder

embedder = OpenAIEmbedder(OpenAI())
config = MemoryConfig(db_path="./memory.db", user_id="alice")

with MemoryManager("openai:gpt-4o-mini", config, embedder=embedder) as memory:
    # Add memories from conversation
    events = memory.add("I just started a new job at Acme Corp as a data scientist")
    # -> [ADD "alice works at acme_corp", ADD "alice is a data_scientist"]

    events = memory.add("I've been promoted to senior data scientist at Acme")
    # -> [UPDATE "alice is a senior data scientist at acme_corp"]

    events = memory.add("I left Acme and joined Beta Inc")
    # -> [DELETE "alice works at acme_corp", ADD "alice works at beta_inc"]

    # Search
    results = memory.search("Where does Alice work?")
    # -> [SearchResult(text="alice works at beta_inc", score=0.92, ...)]
```

```python
from mistralai import Mistral
from grafeo_memory import MemoryManager, MemoryConfig, MistralEmbedder

embedder = MistralEmbedder(Mistral())
config = MemoryConfig(db_path="./memory.db", user_id="alice")

with MemoryManager("mistral:mistral-small-latest", config, embedder=embedder) as memory:
    events = memory.add("I just started a new job at Acme Corp as a data scientist")
    results = memory.search("Where does Alice work?")
```

grafeo-memory implements the reconciliation loop, the intelligence layer that decides what to remember:
- Extract facts from conversation text (LLM call)
- Extract entities and relationships (LLM tool call)
- Search existing memory for related facts (vector + graph)
- Reconcile new facts against existing memory (LLM decides ADD/UPDATE/DELETE/NONE)
- Execute the decisions against GrafeoDB
```
┌──────────────────────────────────────────┐
│              grafeo-memory               │
│                                          │
│  Extractor  ->  Reconciler  ->  Executor │
│    (LLM)          (LLM)        (GrafeoDB)│
└──────────────────┬───────────────────────┘
                   │
         ┌─────────┴──────────┐
         │      GrafeoDB      │
         │   Graph + Vector   │
         │  + Text (optional) │
         │   single .db file  │
         └────────────────────┘
```
```python
config = MemoryConfig(db_path="./chat_memory.db")

with MemoryManager("openai:gpt-4o-mini", config, embedder=embedder) as memory:
    # Each user's memories are isolated
    memory.add("I love hiking in the mountains", user_id="bob")
    memory.add("I prefer beach vacations", user_id="carol")

    bob_results = memory.search("vacation preferences", user_id="bob")
    # -> hiking, mountains
    carol_results = memory.search("vacation preferences", user_id="carol")
    # -> beach vacations
```

grafeo-memory uses pydantic-ai model strings, so any provider pydantic-ai supports works out of the box:
```python
# OpenAI — use OpenAIEmbedder for embeddings
MemoryManager("openai:gpt-4o-mini", config, embedder=OpenAIEmbedder(OpenAI()))

# Anthropic — pair with OpenAI or custom embedder
MemoryManager("anthropic:claude-sonnet-4-5-20250929", config, embedder=embedder)

# Groq — pair with OpenAI or custom embedder
MemoryManager("groq:llama-3.3-70b-versatile", config, embedder=embedder)

# Mistral — use MistralEmbedder for embeddings
MemoryManager("mistral:mistral-small-latest", config, embedder=MistralEmbedder(Mistral()))

# Google — pair with OpenAI or custom embedder
MemoryManager("google-gla:gemini-2.0-flash", config, embedder=embedder)
```

| Class | Provider | Default Model | Install Extra |
|---|---|---|---|
| `OpenAIEmbedder` | OpenAI | `text-embedding-3-small` | `[openai]` |
| `MistralEmbedder` | Mistral | `mistral-embed` | `[mistral]` |
Both accept an optional `model` parameter to override the default.
Implement the EmbeddingClient protocol to use any embedding provider:
```python
from grafeo_memory import EmbeddingClient

class MyEmbedder:
    def embed(self, texts: list[str]) -> list[list[float]]:
        # Call your embedding API
        return [...]

    @property
    def dimensions(self) -> int:
        return 1024  # your model's output dimensions

memory = MemoryManager("openai:gpt-4o-mini", config, embedder=MyEmbedder())
```

grafeo-memory includes a built-in MCP server so AI agents (Claude Desktop, Cursor, etc.) can use it as a tool.
```
uv add grafeo-memory[mcp]
# or: pip install grafeo-memory[mcp]
```

Add to `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "grafeo-memory": {
      "command": "grafeo-memory-mcp",
      "env": {
        "GRAFEO_MEMORY_MODEL": "openai:gpt-4o-mini",
        "GRAFEO_MEMORY_DB": "./memory.db"
      }
    }
  }
}
```

| Tool | Description |
|---|---|
| `memory_add` | Add a memory by extracting facts from text |
| `memory_add_batch` | Add multiple memories in one batch |
| `memory_search` | Search memories by semantic similarity and graph context |
| `memory_update` | Update an existing memory's text |
| `memory_delete` | Delete a single memory |
| `memory_delete_all` | Delete all memories for a user |
| `memory_list` | List all stored memories |
| `memory_summarize` | Consolidate old memories into topic-grouped summaries |
| `memory_history` | Show change history for a memory |
| Variable | Default | Description |
|---|---|---|
| `GRAFEO_MEMORY_MODEL` | `openai:gpt-4o-mini` | pydantic-ai model string |
| `GRAFEO_MEMORY_DB` | (in-memory) | Database file path |
| `GRAFEO_MEMORY_USER` | `default` | Default user ID |
| `GRAFEO_MEMORY_YOLO` | (off) | Set to 1 for all features |
Supports stdio (default), SSE, and streamable HTTP:

```
grafeo-memory-mcp                  # stdio (default)
grafeo-memory-mcp sse              # SSE
grafeo-memory-mcp streamable-http  # streamable HTTP
```

Note: This is different from grafeo-mcp, which exposes the raw GrafeoDB database. grafeo-memory-mcp wraps the high-level memory API (extract, reconcile, search, summarize).
grafeo-memory supports OpenTelemetry instrumentation via pydantic-ai. When enabled, all LLM calls (extraction, reconciliation, summarization, reranking) are traced automatically.
```python
config = MemoryConfig(instrument=True)  # uses the global OTel provider
```

For custom providers:
```python
from grafeo_memory import InstrumentationSettings

config = MemoryConfig(instrument=InstrumentationSettings(
    tracer_provider=my_tracer_provider,
    include_content=False,
))
```

| | Traditional stack | grafeo-memory |
|---|---|---|
| Infrastructure | Neo4j + Qdrant (Docker) | Single .db file |
| Install size | ~750MB (Docker images) | ~16MB (uv add) |
| Offline/edge | Requires servers | Yes |
| Graph + vector | Separate services | Unified engine |
| LLM providers | Varies | pydantic-ai (OpenAI, Anthropic, Mistral, Groq, Google) |
| Embeddings | External API required | Protocol-based (any provider) |
- `MemoryManager(model, config=None, *, embedder)`: create a memory manager. `model` is a pydantic-ai model string (e.g. `"openai:gpt-4o-mini"`).
- `add(messages, user_id=None, session_id=None, metadata=None, *, infer=True, importance=1.0, memory_type="semantic")` → `AddResult` (list of `MemoryEvent`).
- `search(query, user_id=None, k=10, *, filters=None, rerank=True, memory_type=None)` → `SearchResponse` (list of `SearchResult`).
- `update(memory_id, text)` → `MemoryEvent`: update a memory's text directly.
- `get_all(user_id=None, memory_type=None)` → `list[SearchResult]`.
- `delete(memory_id)` → `bool`.
- `delete_all(user_id=None)` → `int` (count deleted).
- `summarize(user_id=None, *, preserve_recent=5, batch_size=20)` → `AddResult`.
- `history(memory_id)` → `list[HistoryEntry]`.
- `set_importance(memory_id, importance)` → `bool`.
- `close()`: close the database.
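For intuition on `summarize`, a toy consolidation along these lines (the real implementation uses an LLM to write topic-grouped summaries; this sketch only groups and counts, and all names are illustrative):

```python
from collections import defaultdict

def summarize_sketch(memories: list[tuple[str, str]],
                     preserve_recent: int = 5) -> list[str]:
    """Keep the newest facts verbatim, roll the older ones up per topic."""
    old, recent = memories[:-preserve_recent], memories[-preserve_recent:]
    by_topic: dict[str, list[str]] = defaultdict(list)
    for topic, text in old:
        by_topic[topic].append(text)
    # An LLM would write a prose summary here; we just count per topic.
    summaries = [f"{topic}: {len(texts)} facts consolidated"
                 for topic, texts in by_topic.items()]
    return summaries + [text for _, text in recent]

mem = [("work", "w1"), ("work", "w2"), ("travel", "t1"),
       ("work", "w3"), ("travel", "t2"), ("food", "f1")]
summarize_sketch(mem, preserve_recent=2)
# older work/travel facts collapse to one summary line each;
# the 2 most recent facts survive unchanged
```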
Use as a context manager: `with MemoryManager(...) as memory:`. Multiple sessions in the same process are supported.
- `db_path`: path to database file (`None` for in-memory)
- `user_id`: default user scope (default `"default"`)
- `session_id`: default session scope
- `agent_id`: default agent scope
- `similarity_threshold`: max embedding distance for reconciliation (default 0.7)
- `embedding_dimensions`: vector dimensions (default 1536)
- `enable_importance`: enable composite scoring with recency/frequency/importance (default False)
- `weight_topology`: topology score weight for graph-connected memories (default 0.0, requires `enable_importance`)
- `enable_topology_boost`: re-rank search results by graph connectivity, no LLM call (default False)
- `topology_boost_factor`: strength of topology boost (default 0.2)
- `consolidation_protect_threshold`: protect well-connected memories from summarize (default 0.0, off)
- `instrument`: OpenTelemetry instrumentation, `True` or `InstrumentationSettings` (default False)
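To make `similarity_threshold` concrete: during reconciliation, an existing memory counts as related only if its embedding distance to the new fact is at or below the threshold. A toy illustration using cosine distance (the actual distance metric is a GrafeoDB implementation detail; this choice is an assumption for illustration only):

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

SIMILARITY_THRESHOLD = 0.7  # the MemoryConfig default

new_fact_vec = [1.0, 0.0, 0.2]
candidates = {
    "alice works at acme_corp": [0.9, 0.1, 0.3],  # close -> reconciliation candidate
    "bob likes hiking":         [0.0, 1.0, 0.0],  # far   -> ignored
}
related = [text for text, vec in candidates.items()
           if cosine_distance(new_fact_vec, vec) <= SIMILARITY_THRESHOLD]
# related contains only the acme_corp fact
```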
- `.embed(texts: list[str]) -> list[list[float]]`: generate embeddings for a batch of texts
- `.dimensions -> int`: return the embedding vector dimensionality
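A minimal deterministic implementation of this protocol, useful for tests since it calls no API (the class is hypothetical and hash-based vectors carry no semantic meaning):

```python
import hashlib

class FakeEmbedder:
    """Satisfies the EmbeddingClient protocol without any network calls."""

    def __init__(self, dims: int = 1536):
        self._dims = dims

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            digest = hashlib.sha256(text.encode()).digest()
            # Repeat the 32-byte digest to fill the requested dimensionality.
            raw = (digest * (self._dims // len(digest) + 1))[: self._dims]
            vectors.append([b / 255.0 for b in raw])
        return vectors

    @property
    def dimensions(self) -> int:
        return self._dims
```

Because the protocol is structural, any object with these two members can be passed as `embedder=` without subclassing anything.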
- `AddResult`: list subclass holding `MemoryEvent` items, with `.usage` for LLM token counts
- `SearchResponse`: list subclass holding `SearchResult` items, with `.usage` for LLM token counts
- `MemoryEvent`: `.action` (ADD/UPDATE/DELETE/NONE), `.memory_id`, `.text`, `.old_text`
- `SearchResult`: `.memory_id`, `.text`, `.score`, `.user_id`, `.metadata`, `.relations`, `.memory_type`
- `HistoryEntry`: `.event`, `.old_text`, `.new_text`, `.timestamp`, `.actor_id`, `.role`
```python
# AddResult is iterable:
for event in memory.add("text"):
    print(event.action, event.text)

# SearchResponse is iterable:
for result in memory.search("query"):
    print(result.text, result.score)
```

grafeo-memory is part of the GrafeoDB ecosystem:
- grafeo: Core graph database engine (Rust)
- grafeo-langchain: LangChain integration
- grafeo-llamaindex: LlamaIndex integration
- grafeo-mcp: MCP server for raw GrafeoDB access
- grafeo-memory-mcp (built-in): MCP server for the memory API (`uv add grafeo-memory[mcp]` or `pip install grafeo-memory[mcp]`)
All packages share the same .db file. Build memories with grafeo-memory, query them with grafeo-langchain, expose them via MCP.
- Python 3.12+
Apache-2.0