A semantic layer for your data: archmax describes it, you sharpen it, AI agents query it.
Documentation · Issues · GitHub
Heads up: archmax is experimental. The core ideas are stable, but APIs, file formats, and configuration may change between releases. We try to avoid breaking changes, but can't guarantee stability yet. Pin your version and check the changelog before upgrading.
*Screenshots: Graph View · Model Builder · MCP Access*

*Screenshots: Test Agents · Test Cases · Test Runs*
Connecting AI agents to databases today is a gamble. Either you hand over raw SQL access and hope the LLM doesn't hallucinate column names, run destructive queries, or leak sensitive data; or you spend weeks writing bespoke tool integrations that break the moment your schema changes.
Even when agents can query your database, they have no idea what the data actually means. A column called amt_01 could be revenue, tax, or a refund. A table called dim_cust is meaningless without business context. Without that context, agents guess, and guessing on real data has real consequences.
The gap between "AI can write SQL" and "AI understands our data" is where most agent-database projects stall.
archmax puts a semantic layer between your databases and AI agents. archmax describes your data; you sharpen it with the context that matters; agents query through the Model Context Protocol (MCP).
Instead of raw database access, agents get:
- Business context: field descriptions, synonyms, examples, and enum values so the agent knows `amt_01` is "gross revenue in EUR"
- Guardrails: read-only queries scoped to sandboxed VIEWs, not raw tables; token-based access with model-level permissions
- Federation: a single query interface across Postgres, MySQL, MSSQL, SQLite, DuckDB, and Iceberg REST Catalogs, powered by DuckDB's in-process engine
- Structure: typed datasets, explicit relationships, and reusable metric definitions stored as OSI YAML
- Token efficiency: OSI models are converted to compressed markdown digests before being sent to agents, reducing token usage by 3–5× compared to raw YAML
The result: AI agents that query your data reliably, safely, and with understanding, not guesswork. The approach is conceptually similar to Snowflake Semantic Views: both layer business meaning (metrics, dimensions, relationships) over physical tables so consumers get consistent definitions instead of raw column names. The key difference is that archmax is database-agnostic, federating across Postgres, MySQL, MSSQL, SQLite, DuckDB, and Iceberg REST Catalogs.
Built on the Open Semantic Interchange (OSI) spec, an open standard for describing datasets, relationships, and metrics in a vendor-neutral way. archmax uses OSI YAML as its internal storage format for semantic model definitions — every dataset, field, relationship, and metric is persisted as spec-compliant YAML files on disk.
Because the OSI YAML format is verbose and token-intensive, archmax does not serve raw YAML to AI agents. Instead, when an agent requests model information through MCP tools, the OSI model is converted on-the-fly into a compressed markdown digest that preserves all semantically relevant information (field types, descriptions, enums, relationships, examples) while using 3–5× fewer tokens than the equivalent YAML. This makes agent interactions significantly cheaper and faster without sacrificing context quality.
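For a feel of what this looks like on disk, here is a rough sketch of a dataset definition in the stored YAML (field names and structure are illustrative only, not the exact OSI schema — consult the OSI spec for the real layout):

```yaml
# Illustrative sketch — not the exact OSI schema.
datasets:
  - name: orders
    fields:
      - name: amt_01
        type: decimal
        description: Gross revenue in EUR
        synonyms: [revenue, gross_amount]
      - name: status
        type: string
        enum: [pending, shipped, cancelled]
```

The markdown digest served to agents condenses the same information (types, descriptions, enums, examples) into a far more compact form, which is where the 3–5× token savings come from.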
- Semantic Models: describe tables as datasets with typed fields, relationships, and metrics in YAML
- MCP Server: expose your semantic layer to any MCP-compatible AI agent (Claude, Cursor, custom agents)
- Data Federation: query across Postgres, MySQL, MSSQL, SQLite, DuckDB, and Iceberg REST Catalogs from a single project
- AI-Assisted Model Builder: a chat-based agent discovers schemas, maps fields, detects enums, and infers relationships
- Scoped Query Execution: agents run read-only SQL against sandboxed VIEWs, never raw tables
- Token-Based Access Control: MCP tokens with configurable model scopes and expiry
- Testing Suite: test cases that validate whether agents can use your semantic models correctly
- Self-Hosted: deploy with Docker in minutes, keep your data on your infrastructure
Run archmax as a single container with embedded MongoDB and Redis:
```shell
docker run -d \
  --name archmax \
  -p 8080:8080 \
  -e BETTER_AUTH_SECRET=$(openssl rand -base64 32) \
  -e UI_USERNAME=admin \
  -e UI_PASSWORD=changeme \
  -e AGENT_API_KEY=your-openrouter-api-key \
  -v ~/.archmax:/data \
  ghcr.io/archmaxai/archmax:latest
```

Volume mount: the `-v ~/.archmax:/data` bind mount persists all application data on the host — semantic model YAML files (`projects/`), embedded MongoDB data (`mongodb/`), and the DuckDB extension cache (`.duckdb/`). Without this mount, all data is lost when the container is removed.
Save your `BETTER_AUTH_SECRET`. If you lose this value or change it on a restart, all sessions and authentication data become invalid. Generate it beforehand and store it in a safe place.

`AGENT_API_KEY` is required for AI features. The Semantic Model Builder, Testing Playground, and automatic title generation all need an API key for an OpenAI-compatible provider. The default endpoint is OpenRouter. Without this key, agent features will be unavailable.
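One way to make the secret survive restarts is to generate it once and read it from a file on each `docker run` (the path below is illustrative — any secret store works):

```shell
# Generate BETTER_AUTH_SECRET once and keep it next to the data directory.
mkdir -p ~/.archmax
openssl rand -base64 32 > ~/.archmax/better-auth-secret
chmod 600 ~/.archmax/better-auth-secret
```

Then start the container with `-e BETTER_AUTH_SECRET=$(cat ~/.archmax/better-auth-secret)` instead of generating a fresh value inline, so every restart reuses the same secret.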
Open http://localhost:8080 and log in with username admin (or your UI_USERNAME) and the password you set in UI_PASSWORD.
Runs archmax with dedicated MongoDB and Redis containers instead of the embedded services:
```shell
# 1. Clone the repo (or copy docker-compose.yml + .env.example)
git clone https://github.com/archmaxai/archmax.git
cd archmax

# 2. Create your .env from the example and fill in the required values
cp .env.example .env

# 3. Start the stack
docker compose up -d
```

The .env file needs at minimum:

```shell
BETTER_AUTH_SECRET=<random-32+-char-secret>   # openssl rand -base64 32
UI_PASSWORD=<your-admin-password>
AGENT_API_KEY=<your-openrouter-api-key>       # required for AI features
```

Optional overrides (defaults shown):
| Variable | Default |
|---|---|
| `UI_USERNAME` | `admin` |
| `AGENT_API_BASE_URL` | `https://openrouter.ai/api/v1` |
| `AGENT_MODEL` | `anthropic/claude-sonnet-4.6` |
The stack exposes port 8080 and persists data in two Docker volumes:
| Volume | Container Path | Contents |
|---|---|---|
| `archmax-data` | `/data` | Semantic model YAML files, DuckDB extension cache |
| `mongo-data` | `/data/db` (mongo container) | MongoDB database files |
These volumes are created automatically by Docker Compose. To use host bind mounts instead, edit docker-compose.yml and replace the named volumes with paths (e.g. ./data/archmax:/data).
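The overall shape of such a stack looks roughly like this (a sketch only — service names and image tags are assumptions, and the authoritative file is the `docker-compose.yml` shipped in the repo):

```yaml
# Illustrative sketch — use the docker-compose.yml from the repo.
services:
  archmax:
    image: ghcr.io/archmaxai/archmax:latest
    ports: ["8080:8080"]
    env_file: .env
    environment:
      MONGODB_URI: mongodb://mongo:27017/archmax
      REDIS_URL: redis://redis:6379
    volumes:
      - archmax-data:/data
    depends_on: [mongo, redis]
  mongo:
    image: mongo:7
    volumes:
      - mongo-data:/data/db
  redis:
    image: redis:7
volumes:
  archmax-data:
  mongo-data:
```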
```shell
git clone https://github.com/archmaxai/archmax.git
cd archmax
cp .env.example .env.local   # Edit with your settings
pnpm install
pnpm dev
```

| Service | URL |
|---|---|
| Frontend | http://localhost:5173 |
| API | http://localhost:3000 |
| Docs | http://localhost:4321 |
| MCP | POST http://localhost:3000/mcp/&lt;project-slug&gt;/mcp |
```
archmax/
├── apps/
│   ├── api/       # Hono API server
│   ├── e2e/       # Playwright end-to-end tests
│   ├── frontend/  # Vite + React SPA (TanStack Router)
│   ├── worker/    # BullMQ worker for agent jobs
│   └── docs/      # Documentation site (Astro Starlight)
├── packages/
│   ├── core/      # Shared models, services, config (@archmax/core)
│   └── ui/        # React UI components (@archmax/ui)
└── openspec/      # Specifications and change proposals
```
Tech stack: TypeScript, Hono, React 19, Vite 6, MongoDB, DuckDB, Tailwind CSS 4, Turborepo, Playwright
| Tool | Description |
|---|---|
| `list_semantic_models` | List semantic models the token has access to |
| `get_semantic_model` | Overview of a model with datasets, relationships, and metrics |
| `get_datasets` | Fields for one or more datasets with types, examples, enums, and instructions |
| `execute_query` | Run a read-only SQL query scoped to a semantic model's VIEWs |
| `request_improvement` | Submit an improvement request for a semantic model |
Configure your MCP client with:

- Endpoint: `https://your-server/mcp/<project-slug>/mcp`
- Auth: `Bearer <your-mcp-token>`
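To sanity-check the endpoint before wiring up a client, you can send a raw request with curl (host and token below are placeholders; the JSON-RPC `tools/list` shape and the `Accept` header follow the MCP streamable-HTTP transport, and depending on the server an `initialize` request may be required first):

```shell
# Placeholder host and token — replace with your own values.
curl -s -X POST https://your-server/mcp/your-project/mcp \
  -H "Authorization: Bearer sk-your-token" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}'
```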
```json
{
  "mcpServers": {
    "archmax": {
      "url": "https://your-server/mcp/your-project/mcp",
      "headers": {
        "Authorization": "Bearer sk-your-token"
      }
    }
  }
}
```

Key environment variables (see .env.example for the full list):
| Variable | Description |
|---|---|
| `BETTER_AUTH_SECRET` | Session encryption secret (min 32 chars). Save and reuse across restarts. |
| `UI_USERNAME` / `UI_PASSWORD` | Initial admin credentials (default username: admin) |
| `APP_BASE_URL` | Public URL of this instance (e.g. https://archmax.example.com). Set when behind a reverse proxy to auto-configure CORS and auth. |
| `ENCRYPTION_KEY` | Optional. Encrypts database connection passwords and API keys at rest (AES-256-GCM). Generate with `openssl rand -base64 32`. |
| `MONGODB_URI` | MongoDB connection string (optional in Docker; embedded when omitted) |
| `AGENT_API_BASE_URL` | OpenAI-compatible API endpoint (default: OpenRouter) |
| `AGENT_API_KEY` | API key for the AI agent (required for agent features) |
| `AGENT_MODEL` | LLM model identifier (e.g., anthropic/claude-sonnet-4) |
| `REDIS_URL` | Optional. Enables BullMQ worker queue (embedded in Docker when omitted) |
archmax uses OpenSpec for spec-driven development. Every feature PR must include a corresponding spec change.
```shell
npm install -g openspec-cli
```

OpenSpec integrates with your AI coding assistant. Use these prompts in Cursor (or any OpenSpec-aware agent):
| Prompt | What it does |
|---|---|
| `/openspec-proposal` | Scaffolds a new change proposal with proposal.md, tasks.md, and spec deltas |
| `/openspec-apply` | Implements an approved proposal by following the task checklist |
| `/openspec-archive` | Archives a completed change and updates specs |
The typical flow:
1. Propose — Run `/openspec-proposal` and describe the change. The agent creates the proposal directory, writes spec deltas, and validates with `openspec validate <change-id> --strict`.
2. Review — Get the proposal approved before any code is written.
3. Implement — Run `/openspec-apply` to work through the task list.
4. Archive — After merging, run `/openspec-archive` to move the change to the archive and update the canonical specs.
You can also drive the workflow manually with the CLI (openspec list, openspec show, openspec validate, openspec archive). See openspec/AGENTS.md for the full reference.
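The manual flow maps onto those CLI commands roughly like this (`<change-id>` stands for whatever ID your proposal uses; see openspec/AGENTS.md for exact flags):

```shell
openspec list                            # active change proposals
openspec show <change-id>                # inspect a proposal
openspec validate <change-id> --strict   # check spec deltas
openspec archive <change-id>             # move to archive after merge
```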
See the Contributing guide for details.





