Universal, Multi-Agent Business Intelligence Platform
Eerly AI is a production-grade backend that transforms natural language questions into professional data analysis. Unlike simple Text-to-SQL bots, Eerly acts as a Senior Data Scientist, providing:
- 📝 Structured Reports (Observations, Insights, Recommendations)
- 📊 Smart Dashboarding (Auto-generates chart layouts for frontend rendering)
- 🛡️ Self-Healing SQL (Auto-corrects invalid queries via retry loops, up to 3 attempts)
- ⚡ Low-Latency Pipeline (Schema caching, async LLM calls, HTTP keep-alive)
- 🔍 Full Debug Logging (Structured logs at every pipeline stage with elapsed ms)
- 🐳 Fully Containerised (Runs end-to-end via Docker Compose)
It does not just return data rows; it generates a comprehensive report containing Observations, Key Insights, and Strategic Recommendations.
Automatically determines when a visualisation is needed and generates a JSON layout (Bar, Line, Pie, Scatter) for frontend rendering.
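For illustration, a spec of that shape might look like the Python dict below. The field names (`needs_chart`, `widgets`, `type`, `x`, `y`) are placeholders for this sketch — the exact keys emitted by `dashboard_designer.py` may differ, so check its output rather than relying on these names:

```python
# Hypothetical dashboard spec — field names are illustrative, not the
# documented contract of dashboard_designer.py.
example_spec = {
    "needs_chart": True,
    "widgets": [
        {
            "type": "bar",        # one of: bar, line, pie, scatter
            "title": "Sales by State",
            "x": "state",         # column driving the x-axis / labels
            "y": "total_sales",   # column driving the y-axis / values
        }
    ],
}

# A frontend can iterate widgets and dispatch on "type" to render each chart.
for widget in example_spec["widgets"]:
    assert widget["type"] in {"bar", "line", "pie", "scatter"}
```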
If the AI generates invalid SQL, the system catches the error, analyses the schema, and self-corrects the query (up to 3 retries). It also features a robust LLM Fallback Mechanism that automatically switches to a high-speed Groq model (Llama 3.3 70B) if the primary Azure endpoint experiences rate limits or network issues.
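The retry behaviour described above can be sketched as a plain Python loop. This is an illustrative outline, not the actual code in `agents/` — the `generate_sql` and `execute` callables stand in for the real SQL-generator agent and executor, and the key idea is that the database error is fed back into the next generation attempt:

```python
MAX_SQL_RETRIES = 3  # matches the "up to 3 attempts" described above

def run_with_self_healing(generate_sql, execute, question, schema):
    """Illustrative self-healing loop: regenerate SQL with the previous
    DB error included in the prompt until it executes or we give up."""
    error = None
    for attempt in range(1, MAX_SQL_RETRIES + 1):
        sql = generate_sql(question, schema, previous_error=error)
        try:
            return execute(sql)
        except Exception as exc:  # in practice: pyodbc / SQLAlchemy errors
            error = f"Attempt {attempt} failed: {exc}"
    raise RuntimeError(f"SQL still failing after {MAX_SQL_RETRIES} attempts: {error}")
```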
- Schema TTL Cache – DB table-info fetched once per 10 minutes, not on every query (~200–800 ms saving per request)
- Async LLM Calls – Router, SQL Generator, and Dashboard Designer use `ainvoke`, so FastAPI's event loop is never blocked
- HTTP Keep-Alive – Streamlit frontend reuses a persistent connection to the backend
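The schema TTL cache can be sketched in a few lines. This is a minimal illustration assuming the real `core/database.py` keys the cache by database name; the actual implementation may differ in detail:

```python
import time

SCHEMA_TTL_SECONDS = 600  # 10 minutes, as described above
_schema_cache: dict[str, tuple[float, str]] = {}  # db_name -> (expiry, schema_text)

def get_schema(db_name: str, fetch_from_db) -> str:
    """Return the cached schema while it is fresh; otherwise re-fetch it."""
    now = time.monotonic()
    hit = _schema_cache.get(db_name)
    if hit and hit[0] > now:
        return hit[1]                   # cache HIT — no DB round-trip
    schema = fetch_from_db(db_name)     # cache MISS — the ~200–800 ms DB call
    _schema_cache[db_name] = (now + SCHEMA_TTL_SECONDS, schema)
    return schema
```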
Every pipeline stage emits timestamped [DEBUG] / [INFO] / [ERROR] log lines.
Control verbosity with LOG_LEVEL=DEBUG (development) or LOG_LEVEL=INFO (production) — no code change needed.
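A minimal sketch of such a logger factory — the real `core/logger.py` may format fields differently, but the pattern is the same: the level comes from the `LOG_LEVEL` environment variable, so no code change is needed to switch verbosity:

```python
import logging
import os
import sys

def get_logger(name: str) -> logging.Logger:
    """Illustrative centralised factory: level is read from LOG_LEVEL."""
    level = getattr(logging, os.getenv("LOG_LEVEL", "DEBUG").upper(), logging.DEBUG)
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid attaching duplicate handlers on re-import
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s [%(levelname)s] [%(name)s] %(message)s")
        )
        logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```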
| Layer | Technology |
|---|---|
| Infrastructure | Docker & Docker Compose |
| Database | Microsoft SQL Server (Eastman, via Docker) · Aiven PostgreSQL (NK Proteins, ACME Corp) · Aiven MySQL (BIDCO Africa) |
| Backend | FastAPI + Uvicorn |
| Frontend | Streamlit |
| Orchestration | LangGraph |
| LLM Integration | LangChain (Azure OpenAI Primary, Groq Llama 3.3 70B Fallback) |
```
insights-docker/
├── docker/
│   ├── docker-compose.yml    ← Orchestrates all services
│   ├── Dockerfile            ← Shared image for backend + frontends
│   └── init_db.sh            ← DB init helper
│
├── front-end/
│   ├── app.py                ← General Streamlit frontend (port 8501)
│   ├── app_eastman.py        ← Eastman Executive Dashboard (port 8502)
│   └── utils.py              ← SSE streaming + chart rendering helpers
│
├── core/
│   ├── config.py             ← Loads .env settings
│   ├── database.py           ← DB connection + schema TTL cache
│   ├── llm.py                ← Azure OpenAI LLM instance
│   ├── logger.py             ← Centralised logging factory (v3)
│   ├── prompts.py            ← All system prompts
│   ├── state.py              ← LangGraph state schema
│   └── streaming.py          ← Token stream queue registry
│
├── agents/
│   ├── router.py             ← Intent classifier (async, v3)
│   ├── sql_generator.py      ← SQL generation (async + cached schema, v3)
│   ├── executor.py           ← SQL execution + report synthesis
│   ├── dashboard_designer.py ← Dashboard JSON generation (async, v3)
│   └── chat.py               ← General chat responder
│
├── workflows/                ← LangGraph graph definition
├── database_source/          ← Data files & ingestion scripts
├── main.py                   ← FastAPI backend entry point
├── requirements.txt
└── .env                      ← Secrets & config (never commit this)
```
- Docker Desktop installed and running (enable WSL2 backend on Windows)
- Azure OpenAI Endpoint, API Key, and a deployed model (e.g. `gpt-5`)
- Git (to clone the repo)
```
git clone <your-repo-url>
cd insights-docker
```

A template is provided at `.env.example`. Copy it and fill in your values:

```
cp .env.example .env
```

Key variables:
| Variable | Purpose |
|---|---|
| `AZURE_OPENAI_API_KEY` | Primary LLM — Azure OpenAI key |
| `AZURE_OPENAI_ENDPOINT` | Azure OpenAI resource URL |
| `AZURE_OPENAI_DEPLOYMENT` | Model deployment name (e.g. `gpt-5`) |
| `AZURE_OPENAI_VERSION` | API version (e.g. `2025-01-01-preview`) |
| `LITELLM_BASE_URL` | (Optional) LiteLLM proxy URL — use instead of direct Azure endpoint |
| `LITELLM_API_KEY` | (Optional) LiteLLM proxy API key |
| `GROQ_API_KEY` | Fallback LLM — auto-activated on Azure rate limits |
| `GEMINI_API_KEY` | Reserved for future use |
| `AIVEN_SERVICE_URI` | Aiven PostgreSQL — NK Proteins & ACME Corp |
| `AIVEN_SERVICE_URI_MySQL` | Aiven MySQL — BIDCO Africa |
| `SQLALCHEMY_URL_EASTMAN` | Eastman Sales DB (SQL Server connection string) |
⚠️ Never commit `.env` to Git. It is already listed in `.gitignore`.
All `docker compose` commands must be run from inside the `docker/` subfolder.
```
cd docker
docker compose up --build -d
```

This builds one shared Docker image and starts 4 services:
| Container | Role | Port |
|---|---|---|
| `eerly_sql_server` | Microsoft SQL Server 2022 | 1434 (host) |
| `eerly_backend` | FastAPI + LangGraph | 8000 |
| `eerly_frontend` | Streamlit (general) | 8501 |
| `eerly_frontend_eastman` | Eastman Executive Dashboard | 8502 |
Wait ~30 seconds, then verify all containers are running:
```
docker compose ps
```

All services should show `Up`.
```
python database_source/Eastman_Auto/ingestion_eastman_data.py
```

This populates `Eastman_Sales_DB`, used by the Eastman Executive Dashboard on port 8502.
This script drops and recreates the nk_proteins and acme_corp schemas on your Aiven PostgreSQL instance, populated with synthetic ERP data.
```
cd database_source/Avien
python generate_erp_data.py nk_proteins
python generate_erp_data.py acme_corp
```

You can verify the connection later using `python test_aiven_connection.py`.
This script creates and populates the BIDCO Africa tables in your Aiven MySQL instance with synthetic sales and inventory data.
```
cd database_source/BIDCO_Africa
python update_avien_mysql.py
```

You can verify the connection and data relationships using `python test_avien-mysql-bda-connection.py`.
```
cd docker
docker compose up -d
```

Only use `--build` again if you change `requirements.txt` or the `Dockerfile`.
| Frontend | URL | Description |
|---|---|---|
| General Dashboard | http://localhost:8501 | Conversational UI — NK Proteins, ACME Corp, Eastman, BIDCO Africa |
| Eastman Dashboard | http://localhost:8502 | Eastman Executive Dashboard (dedicated view) |
| Backend API Docs | http://localhost:8000/docs | FastAPI Swagger UI |
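For scripted access, a client along the following lines could call the backend's streaming endpoint. The route (`/stream`) and the payload field names (`db`, `query`) are inferred from the log lines shown in this README, not from a documented contract — check the Swagger UI at http://localhost:8000/docs for the real request schema before relying on them:

```python
import json
import urllib.request

def build_stream_request(db: str, query: str) -> urllib.request.Request:
    """Build a POST to the (assumed) /stream SSE endpoint."""
    payload = json.dumps({"db": db, "query": query}).encode()
    return urllib.request.Request(
        "http://localhost:8000/stream",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def consume(req: urllib.request.Request) -> None:
    """Print tokens as they stream; SSE frames arrive as 'data: <token>' lines."""
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            line = raw.decode().strip()
            if line.startswith("data:"):
                print(line[5:].strip(), end="", flush=True)
```

With the stack up, `consume(build_stream_request("Eastman_Sales_DB", "Show top 5 states by sales"))` would stream the report token by token.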
```
docker compose down
```

Database data persists in a Docker volume, so you do not need to restore again unless you run `docker compose down -v` (which wipes volumes).
The v3 pipeline emits structured logs at every stage. Open two terminal windows, both pointing to the docker/ folder:
Terminal A — Backend (main pipeline):
```
cd docker
docker compose logs -f backend
```

Terminal B — Frontend Eastman:

```
cd docker
docker compose logs -f frontend_eastman
```

A successful SQL query produces this sequence in the backend logs:
```
[API] POST /stream – db='Eastman_Sales_DB' | query='Show top 5 states...'
[STREAM][START] stream_id=abc-123 ...
[Router] Decision='DATA_QUERY' | LLM latency=312 ms
[SQLGen] Schema cache HIT for 'Eastman_Sales_DB' (expires in 580s)
[SQLGen] SQL generated in 1843 ms
[Executor] DB query completed in 87 ms
[Executor] Synthesis done – 4231 ms, 48 chunks
[Dashboard] Dashboard spec created – 1 widget(s)
[STREAM][DONE] elapsed=7312 ms | tokens=48 | streamed=True
```
Schema cache in action:
- 1st query: `Schema cache MISS – fetching from DB … (Xms)` ← DB round-trip
- 2nd query: `Schema cache HIT (expires in Xs)` ← instant, no DB call ✅
Edit docker/docker-compose.yml and change LOG_LEVEL:
```
- LOG_LEVEL=INFO    # production: only warnings/errors
- LOG_LEVEL=DEBUG   # development: full pipeline trace (default)
```

Then restart: `docker compose up -d --no-deps backend`
The full executive demo script is in `c_suite_queries.md`.
- "What are our top 3 most profitable products based on standard gross margin?"
- "Predict our total expected cash inflow over the next 30 days based on open invoices."
- "Identify our dead stock: which materials haven't been sold in over 90 days, and what is the total working capital blocked?"
Micro (instant fact retrieval):
- "Total Sales" · "Sales Trends" · "Top Products" · "Worst Performing Regions"
Operational:
- "Show me sales for Fiscal Year 2024."
- "Which region has the highest returns?"
- "Compare sales between ENEPL and EAPL subsidiaries."
Strategic (tests reasoning + synthesis):
- "Analyze the sales performance of the North region vs the South for the last 6 months and tell me which product category is driving the difference."
- "Identify the bottom 3 performing states in terms of sales growth and list the top-selling products in those states to see if there is a mismatch."
Multi-turn conversational flow (tests context retention):
| Turn | Query |
|---|---|
| 1 | "Show me the monthly sales trend for the current fiscal year." |
| 2 | "Okay, now filter this for just the North region." |
| 3 | "Why is there a dip in November?" |
| 4 | "List the top customers who bought during that dip." |
Executive Summary (triggers 3-section Observations → Insights → Recommendations format):
- "Give me an executive summary of our sales performance in Karnataka."
- "Summarize the performance of our Solar Solutions business segment."
- "What are the top-selling product categories by revenue this quarter?"
- "Which regions have the highest inventory turnover rate?"
- "Show me a breakdown of sales performance by country."
You are running `docker compose` from the wrong directory.
Fix: Always `cd` into the `docker/` subfolder first:

```
cd insights-docker/docker
docker compose <command>
```

`pyodbc.OperationalError: Login failed for user 'sa'`
Check that `DB_PASSWORD` in `docker-compose.yml` matches `MSSQL_SA_PASSWORD` in the same file. Special characters like `@` must be URL-encoded as `%40` in all `SQLALCHEMY_URL_*` connection strings.
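Python's standard library gives a quick way to check how a password with special characters must appear inside a connection string. The password, host, and driver below are made up for illustration — substitute your own values:

```python
from urllib.parse import quote_plus

password = "P@ssw0rd!"            # example only — use your real password
encoded = quote_plus(password)
print(encoded)                    # → P%40ssw0rd%21

# Illustrative SQL Server URL using the encoded password; host and driver
# names here are assumptions, not taken from this project's compose file.
url = (
    f"mssql+pyodbc://sa:{encoded}@localhost:1434/Eastman_Sales_DB"
    "?driver=ODBC+Driver+17+for+SQL+Server"
)
```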
Run `docker compose logs backend` to see the full error.
Common cause: missing or incorrect `.env` values (esp. `AZURE_OPENAI_API_KEY`).
The schema is cached for 10 minutes. To force a refresh, restart the backend:
```
docker compose restart backend
```

| Version | Changes |
|---|---|
| v3 (Mar 2026) | Schema TTL caching, async LLM agents (router, sql_generator, dashboard_designer), HTTP keep-alive, full structured debug logging across all pipeline stages, LOG_LEVEL env var |
| v2 | Streaming SSE responses, Docker multi-service setup, self-healing SQL retries, executive summary prompt |
| v1 | Initial LangGraph pipeline, single database support |