apoorvaeerly/insights-engine

🧠 Eerly AI Insights Engine (v3)

Universal, Multi-Agent Business Intelligence Platform

Eerly AI is a production-grade backend that transforms natural language questions into professional data analysis. Unlike simple Text-to-SQL bots, Eerly acts as a Senior Data Scientist, providing:

  • 📝 Structured Reports (Observations, Insights, Recommendations)
  • 📊 Smart Dashboarding (Auto-generates chart layouts for frontend rendering)
  • 🛡️ Self-Healing SQL (Auto-corrects invalid queries via retry loops, up to 3 attempts)
  • ⚡ Low-Latency Pipeline (Schema caching, async LLM calls, HTTP keep-alive)
  • 🔍 Full Debug Logging (Structured logs at every pipeline stage with elapsed ms)
  • 🐳 Fully Containerised (Runs end-to-end via Docker Compose)

🚀 Key Capabilities

🎓 Senior Analyst Persona

Does not just return data rows — generates a comprehensive report containing Observations, Key Insights, and Strategic Recommendations.

📊 Smart Dashboarding

Automatically determines when a visualisation is needed and generates a JSON layout (Bar, Line, Pie, Scatter) for frontend rendering.
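The exact JSON contract is defined in agents/dashboard_designer.py; purely as an illustration, a spec of this general shape is what a frontend might render (all field names here are hypothetical, not the project's actual schema):

```json
{
  "needs_chart": true,
  "widgets": [
    {
      "type": "bar",
      "title": "Top 5 States by Sales",
      "x": "state",
      "y": "total_sales"
    }
  ]
}
```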

🛡️ Self-Healing Architecture

If the AI generates invalid SQL, the system catches the error, analyses the schema, and self-corrects the query (up to 3 retries). It also features a robust LLM Fallback Mechanism that automatically switches to a high-speed Groq model (Llama 3.3 70B) if the primary Azure endpoint experiences rate limits or network issues.
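The retry loop described above can be sketched as follows. This is an illustrative sketch, not the project's actual code: `generate_sql` and `run_sql` stand in for the real LLM and database calls, and the key idea is that each retry feeds the previous error message back into the prompt.

```python
# Illustrative self-healing SQL loop (sketch; the real logic lives in the
# sql_generator/executor agents). generate_sql and run_sql are placeholders.
MAX_RETRIES = 3

def self_healing_query(question, generate_sql, run_sql):
    """Ask the LLM for SQL; on DB errors, feed the error text back and retry."""
    error = None
    for attempt in range(1, MAX_RETRIES + 1):
        sql = generate_sql(question, previous_error=error)
        try:
            return run_sql(sql)
        except Exception as exc:  # in practice, catch the driver's error class
            error = str(exc)      # the next prompt includes this error text
    raise RuntimeError(f"SQL still failing after {MAX_RETRIES} attempts: {error}")
```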

⚡ Latency Optimisations (v3)

  • Schema TTL Cache – DB table-info fetched once per 10 minutes, not on every query (~200–800 ms saving per request)
  • Async LLM Calls – Router, SQL Generator, and Dashboard Designer use ainvoke so FastAPI's event loop is never blocked
  • HTTP Keep-Alive – Streamlit frontend reuses a persistent connection to the backend
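The schema TTL cache can be sketched in a few lines. This is a minimal illustration under stated assumptions (the real implementation is in core/database.py and may differ); `fetch_schema` stands in for the actual DB round-trip:

```python
# Minimal TTL cache sketch: serve cached table info until the TTL expires.
import time

TTL_SECONDS = 600  # 10 minutes, matching the README
_cache = {}        # db_name -> (schema, fetched_at)

def get_schema(db_name, fetch_schema, now=time.monotonic):
    """Return cached schema info, refetching only after the TTL expires."""
    entry = _cache.get(db_name)
    if entry and now() - entry[1] < TTL_SECONDS:
        return entry[0]                 # cache HIT: no DB call
    schema = fetch_schema(db_name)      # cache MISS: DB round-trip
    _cache[db_name] = (schema, now())
    return schema
```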

🔍 Structured Debug Logging (v3)

Every pipeline stage emits timestamped [DEBUG] / [INFO] / [ERROR] log lines.
Control verbosity with LOG_LEVEL=DEBUG (development) or LOG_LEVEL=INFO (production) — no code change needed.
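A logger factory driven by LOG_LEVEL might look like the sketch below. This is an assumption-laden illustration of the pattern, not a copy of core/logger.py; the project's exact log format may differ:

```python
# Sketch of a LOG_LEVEL-driven logger factory (illustrative only).
import logging
import os
import sys

def get_logger(name):
    """Return a named logger whose level comes from the LOG_LEVEL env var."""
    level = os.getenv("LOG_LEVEL", "DEBUG").upper()
    logger = logging.getLogger(name)
    if not logger.handlers:  # attach the handler only once per logger
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter(
            "%(asctime)s [%(levelname)s] [%(name)s] %(message)s"))
        logger.addHandler(handler)
    logger.setLevel(getattr(logging, level, logging.DEBUG))
    return logger
```

Usage would then be `log = get_logger("SQLGen")` followed by `log.debug(...)` or `log.info(...)` at each pipeline stage.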


🧱 Tech Stack

Layer             Technology
Infrastructure    Docker & Docker Compose
Database          Microsoft SQL Server (Eastman, via Docker) · Aiven PostgreSQL (NK Proteins, ACME Corp) · Aiven MySQL (BIDCO Africa)
Backend           FastAPI + Uvicorn
Frontend          Streamlit
Orchestration     LangGraph
LLM Integration   LangChain (Azure OpenAI primary, Groq Llama 3.3 70B fallback)

📂 Project Structure

insights-docker/
├── docker/
│   ├── docker-compose.yml   ← Orchestrates all services
│   ├── Dockerfile           ← Shared image for backend + frontends
│   └── init_db.sh           ← DB init helper
│
├── front-end/
│   ├── app.py               ← General Streamlit frontend (port 8501)
│   ├── app_eastman.py       ← Eastman Executive Dashboard (port 8502)
│   └── utils.py             ← SSE streaming + chart rendering helpers
│
├── core/
│   ├── config.py            ← Loads .env settings
│   ├── database.py          ← DB connection + schema TTL cache
│   ├── llm.py               ← Azure OpenAI LLM instance
│   ├── logger.py            ← Centralised logging factory (v3)
│   ├── prompts.py           ← All system prompts
│   ├── state.py             ← LangGraph state schema
│   └── streaming.py         ← Token stream queue registry
│
├── agents/
│   ├── router.py            ← Intent classifier (async, v3)
│   ├── sql_generator.py     ← SQL generation (async + cached schema, v3)
│   ├── executor.py          ← SQL execution + report synthesis
│   ├── dashboard_designer.py ← Dashboard JSON generation (async, v3)
│   └── chat.py              ← General chat responder
│
├── workflows/               ← LangGraph graph definition
├── database_source/         ← Data files & ingestion scripts
├── main.py                  ← FastAPI backend entry point
├── requirements.txt
└── .env                     ← Secrets & config (never commit this)

🏗️ PART 1: One-Time Setup

Prerequisites

  • Docker Desktop installed and running (enable WSL2 backend on Windows)
  • Azure OpenAI Endpoint, API Key, and a deployed model (e.g. gpt-5)
  • Git (to clone the repo)

Step 1 — Clone the Repository

git clone <your-repo-url>
cd insights-docker

Step 2 — Create the .env File

A template is provided at .env.example. Copy it and fill in your values:

cp .env.example .env

Key variables:

Variable                  Purpose
AZURE_OPENAI_API_KEY      Primary LLM — Azure OpenAI key
AZURE_OPENAI_ENDPOINT     Azure OpenAI resource URL
AZURE_OPENAI_DEPLOYMENT   Model deployment name (e.g. gpt-5)
AZURE_OPENAI_VERSION      API version (e.g. 2025-01-01-preview)
LITELLM_BASE_URL          (Optional) LiteLLM proxy URL — use instead of direct Azure endpoint
LITELLM_API_KEY           (Optional) LiteLLM proxy API key
GROQ_API_KEY              Fallback LLM — auto-activated on Azure rate limits
GEMINI_API_KEY            Reserved for future use
AIVEN_SERVICE_URI         Aiven PostgreSQL — NK Proteins & ACME Corp
AIVEN_SERVICE_URI_MySQL   Aiven MySQL — BIDCO Africa
SQLALCHEMY_URL_EASTMAN    Eastman Sales DB (SQL Server connection string)
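A filled-in .env might have roughly this shape. All values below are placeholders (angle brackets mark what you must supply), and the connection-string formats are the standard Aiven/SQLAlchemy shapes, not values copied from this project:

```
AZURE_OPENAI_API_KEY=<your-azure-key>
AZURE_OPENAI_ENDPOINT=https://<your-resource>.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT=gpt-5
AZURE_OPENAI_VERSION=2025-01-01-preview
GROQ_API_KEY=<your-groq-key>
AIVEN_SERVICE_URI=postgres://<user>:<password>@<host>:<port>/defaultdb?sslmode=require
AIVEN_SERVICE_URI_MySQL=mysql://<user>:<password>@<host>:<port>/defaultdb?ssl-mode=REQUIRED
SQLALCHEMY_URL_EASTMAN=mssql+pyodbc://sa:<password>@<host>:<port>/Eastman_Sales_DB?driver=ODBC+Driver+18+for+SQL+Server
```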

⚠️ Never commit .env to Git. It is already listed in .gitignore.


Step 3 — Build & Start All Containers

All docker compose commands must be run from inside the docker/ subfolder.

cd docker
docker compose up --build -d

This builds one shared Docker image and starts 4 services:

Container                Role                          Port
eerly_sql_server         Microsoft SQL Server 2022     1434 (host)
eerly_backend            FastAPI + LangGraph           8000
eerly_frontend           Streamlit (general)           8501
eerly_frontend_eastman   Eastman Executive Dashboard   8502

Wait ~30 seconds, then verify all containers are running:

docker compose ps

All services should show Up.


Step 4 — Ingest Eastman Sales Data (For Port 8502)

python database_source/Eastman_Auto/ingestion_eastman_data.py

This populates Eastman_Sales_DB used by the Eastman Executive Dashboard on port 8502.


Step 5 — Seed Aiven PostgreSQL Data (NK Proteins & ACME Corp)

This script drops and recreates the nk_proteins and acme_corp schemas on your Aiven PostgreSQL instance, populated with synthetic ERP data.

cd database_source/Avien
python generate_erp_data.py nk_proteins
python generate_erp_data.py acme_corp

You can verify the connection later using python test_aiven_connection.py.


Step 6 — Seed Aiven MySQL Data (BIDCO Africa)

This script creates and populates the BIDCO Africa tables in your Aiven MySQL instance with synthetic sales and inventory data.

cd database_source/BIDCO_Africa
python update_avien_mysql.py

You can verify the connection and data relationships using python test_avien-mysql-bda-connection.py.


🔄 PART 2: Daily Usage

Starting the App

cd docker
docker compose up -d

Only use --build again if you change requirements.txt or Dockerfile.

Accessing the App

Frontend            URL                          Description
General Dashboard   http://localhost:8501        Conversational UI — NK Proteins, ACME Corp, Eastman, BIDCO Africa
Eastman Dashboard   http://localhost:8502        Eastman Executive Dashboard (dedicated view)
Backend API Docs    http://localhost:8000/docs   FastAPI Swagger UI

Stopping the App

docker compose down

Database data persists in a Docker volume. You do not need to restore again unless you run docker compose down -v (which wipes volumes).


🔍 PART 3: Viewing Logs (v3 Feature)

The v3 pipeline emits structured logs at every stage. Open two terminal windows, both pointing to the docker/ folder:

Terminal A — Backend (main pipeline):

cd docker
docker compose logs -f backend

Terminal B — Frontend Eastman:

cd docker
docker compose logs -f frontend_eastman

What to Look For

A successful SQL query produces this sequence in backend logs:

[API] POST /stream – db='Eastman_Sales_DB' | query='Show top 5 states...'
[STREAM][START] stream_id=abc-123 ...
[Router] Decision='DATA_QUERY' | LLM latency=312 ms
[SQLGen] Schema cache HIT for 'Eastman_Sales_DB' (expires in 580s)
[SQLGen] SQL generated in 1843 ms
[Executor] DB query completed in 87 ms
[Executor] Synthesis done – 4231 ms, 48 chunks
[Dashboard] Dashboard spec created – 1 widget(s)
[STREAM][DONE] elapsed=7312 ms | tokens=48 | streamed=True

Schema cache in action:

  • 1st query: Schema cache MISS – fetching from DB … (Xms) ← DB round-trip
  • 2nd query: Schema cache HIT (expires in Xs) ← instant, no DB call ✅

Switching Log Verbosity

Edit docker/docker-compose.yml and change LOG_LEVEL:

- LOG_LEVEL=INFO    # production: key events, warnings, errors
- LOG_LEVEL=DEBUG   # development: full pipeline trace (default)

Then restart: docker compose up -d --no-deps backend


🧪 Example Queries

The full executive demo script is in c_suite_queries.md.

NK Proteins / ACME Corporation

  1. "What are our top 3 most profitable products based on standard gross margin?"
  2. "Predict our total expected cash inflow over the next 30 days based on open invoices."
  3. "Identify our dead stock: which materials haven't been sold in over 90 days, and what is the total working capital blocked?"

Eastman Sales DB — Query Tiers

Micro (instant fact retrieval):

  • "Total Sales" · "Sales Trends" · "Top Products" · "Worst Performing Regions"

Operational:

  • "Show me sales for Fiscal Year 2024."
  • "Which region has the highest returns?"
  • "Compare sales between ENEPL and EAPL subsidiaries."

Strategic (tests reasoning + synthesis):

  • "Analyze the sales performance of the North region vs the South for the last 6 months and tell me which product category is driving the difference."
  • "Identify the bottom 3 performing states in terms of sales growth and list the top-selling products in those states to see if there is a mismatch."

Multi-turn conversational flow (tests context retention):

Turn   Query
1      "Show me the monthly sales trend for the current fiscal year."
2      "Okay, now filter this for just the North region."
3      "Why is there a dip in November?"
4      "List the top customers who bought during that dip."

Executive Summary (triggers 3-section Observations → Insights → Recommendations format):

  • "Give me an executive summary of our sales performance in Karnataka."
  • "Summarize the performance of our Solar Solutions business segment."

BIDCO Africa DB

  1. "What are the top-selling product categories by revenue this quarter?"
  2. "Which regions have the highest inventory turnover rate?"
  3. "Show me a breakdown of sales performance by country."

🔧 Troubleshooting

"no configuration file provided: not found"

You are running docker compose from the wrong directory.
Fix: Always cd into the docker/ subfolder first:

cd insights-docker/docker
docker compose <command>

Database Connection Errors

pyodbc.OperationalError: Login failed for user 'sa'
Check that DB_PASSWORD in docker-compose.yml matches MSSQL_SA_PASSWORD in the same file. Special characters like @ must be URL-encoded as %40 in all SQLALCHEMY_URL_* connection strings.
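A quick way to URL-encode a password is Python's standard-library `urllib.parse.quote_plus` (the password and URL below are examples only, not real credentials):

```python
# URL-encode a DB password before pasting it into a SQLAlchemy URL.
# quote_plus turns '@' into '%40', '#' into '%23', and so on.
from urllib.parse import quote_plus

password = "P@ssw0rd#1"         # example only
encoded = quote_plus(password)  # "P%40ssw0rd%231"
url = (
    "mssql+pyodbc://sa:" + encoded +
    "@<host>:1433/Eastman_Sales_DB?driver=ODBC+Driver+18+for+SQL+Server"
)
```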

Container Starts But Backend Crashes

Run docker compose logs backend to see the full error.
Common cause: missing or incorrect .env values (esp. AZURE_OPENAI_API_KEY).

Schema Not Updating After DB Changes

The schema is cached for 10 minutes. To force a refresh, restart the backend:

docker compose restart backend

📋 Changelog

Version         Changes
v3 (Mar 2026)   Schema TTL caching, async LLM agents (router, sql_generator, dashboard_designer), HTTP keep-alive, full structured debug logging across all pipeline stages, LOG_LEVEL env var
v2              Streaming SSE responses, Docker multi-service setup, self-healing SQL retries, executive summary prompt
v1              Initial LangGraph pipeline, single database support
