CorpRisk-AI is a cloud-native, multi-agent AI system designed to automate corporate banking due diligence and risk assessment. Built using LangGraph and LangChain, the system orchestrates multiple AI agents to retrieve company financial records, analyze compliance against Anti-Money Laundering (AML) policies, and synthesize structured risk reports.
This project was developed to demonstrate enterprise-grade AI engineering, specifically focusing on agentic workflows, RAG (Retrieval-Augmented Generation) architectures, and LLM observability for financial services.
The application uses a stateful graph architecture (LangGraph) containing three distinct reasoning agents, backed by a PostgreSQL data quality layer:
- Document Retriever Agent (RAG): Connects to a Vector Database (ChromaDB/FAISS) to fetch semantically relevant financial documents, unstructured data, and mock AML policy PDFs based on the target company.
- Compliance & Risk Agent: Evaluates the retrieved context against established banking guardrails. Identifies high-risk indicators, compliance breaches, or missing financial data.
- Report Synthesizer Agent: Aggregates findings from the previous agents into a structured JSON payload, providing a final decision (e.g., `APPROVED`, `MANUAL_REVIEW`, `REJECTED`).
- Data Quality Layer (PostgreSQL): Every assessment is logged to a relational database with automated validation checks: completeness, consistency, and timeliness monitoring.
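The hand-off between the three agents can be pictured as a shared state dictionary that each step reads and extends. The following is a minimal, standard-library sketch of that idea only: the state field names, the canned document text, and the keyword rule are all assumptions made for illustration, and in the actual project these steps would be registered as LangGraph nodes and backed by a vector store and an LLM.

```python
from typing import Callable, List, TypedDict


class AssessmentState(TypedDict, total=False):
    """Shared state passed from agent to agent (field names are assumptions)."""
    company_name: str
    documents: List[str]
    risk_flags: List[str]
    report: dict


def retrieve(state: AssessmentState) -> AssessmentState:
    # Stand-in for the vector-store lookup: returns one canned document.
    docs = [f"{state['company_name']}: UBO documentation missing for Q3."]
    return {**state, "documents": docs}


def assess(state: AssessmentState) -> AssessmentState:
    # Stand-in for the LLM compliance check: a simple keyword rule.
    flags = [d for d in state["documents"] if "missing" in d.lower()]
    return {**state, "risk_flags": flags}


def synthesize(state: AssessmentState) -> AssessmentState:
    # Any flag routes the case to manual review in this toy version.
    status = "MANUAL_REVIEW" if state["risk_flags"] else "APPROVED"
    report = {
        "company_name": state["company_name"],
        "status": status,
        "retrieved_documents": len(state["documents"]),
        "risk_flags": state["risk_flags"],
    }
    return {**state, "report": report}


# Linear order mirroring Retriever -> Compliance -> Synthesizer.
PIPELINE: List[Callable[[AssessmentState], AssessmentState]] = [
    retrieve,
    assess,
    synthesize,
]


def run_pipeline(company_name: str) -> dict:
    state: AssessmentState = {"company_name": company_name}
    for step in PIPELINE:
        state = step(state)
    return state["report"]


if __name__ == "__main__":
    print(run_pipeline("TechCorp Innovations Ltd"))
```

Each step returns a new state rather than mutating its input, which keeps the intermediate states easy to inspect when debugging a single agent.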
graph TD;
A[Client Request: Company Name] --> B[FastAPI Backend]
B --> C{LangGraph Orchestrator}
C -->|State: Query| D[Retriever Agent + Vector DB]
D -->|State: Context| E[Compliance Agent]
E -->|State: Risk Flags| F[Synthesizer Agent]
F -->|State: JSON Report| B
C -.->|Telemetry| G[(LangSmith Observability)]
- Multi-Agent Orchestration: Utilizes LangGraph for stateful, cyclic, and conditional agent execution.
- Enterprise RAG Pipeline: Employs optimized document chunking and vector embeddings for precise context retrieval.
- LLM Observability & Evaluation: Integrated deeply with LangSmith to monitor token usage, track execution latency, and evaluate prompt performance and AI safety guardrails.
- Cloud-Native & Scalable: Wrapped in a FastAPI application and containerized via Docker, making it ready for deployment on Azure Kubernetes Service (AKS) or Azure Container Apps.
- Azure Ecosystem Ready: Designed to seamlessly swap standard OpenAI endpoints with secure Azure OpenAI enterprise endpoints.
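To make the document chunking step in the RAG pipeline concrete, here is a small sketch of fixed-size chunking with overlap, written with the standard library only. It is not the project's actual splitter: the function name `chunk_text` and the default sizes are assumptions, and a real pipeline would more likely use a LangChain text splitter that respects sentence or section boundaries.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into character windows that share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text, so no
        # trailing chunk is emitted that only repeats the overlap.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.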
- Core AI: Python, LangChain, LangGraph, OpenAI / Azure OpenAI
- Data & RAG: ChromaDB / FAISS (Vector Store), PyPDFLoader
- Data Quality & Storage: PostgreSQL, SQLAlchemy
- Backend: FastAPI, Uvicorn, Pydantic
- DevOps & Observability: Docker, LangSmith
- Domain: FinTech, Banking Compliance, Risk Analysis
- Python 3.10+
- PostgreSQL 17+ (local install via `brew install postgresql@17`)
- Docker (Optional, for containerized run)
- OpenAI API Key (or Azure OpenAI credentials)
- LangSmith API Key (for observability)
git clone https://github.com/NeelM47/CorpRisk-AI.git
cd CorpRisk-AI

Create a .env file in the root directory:
OPENAI_API_KEY=your_openai_key
LANGCHAIN_TRACING_V2=true
LANGCHAIN_ENDPOINT="https://api.smith.langchain.com"
LANGCHAIN_API_KEY=your_langsmith_key
LANGCHAIN_PROJECT="CorpRisk-DueDiligence"

uv venv
source .venv/bin/activate
uv sync

createdb corp_risk_db
# Tables are created automatically on first startup

Run the ingestion script to populate the Vector DB with mock corporate financial reports and AML policies.
python src/ingest_data.py

Via Python:
uv run uvicorn src.main:app --reload --host 0.0.0.0 --port 8000

Via Docker:
docker build -t corprisk-ai .
docker run -p 8000:8000 --env-file .env corprisk-ai

Once the server is running, access the automatic Swagger documentation at http://localhost:8000/docs.
Endpoint: POST /api/v1/assess-company
Request Payload:
{
"company_name": "TechCorp Innovations Ltd",
"assessment_type": "standard_due_diligence"
}
Response Payload:
{
"company_name": "TechCorp Innovations Ltd",
"status": "MANUAL_REVIEW",
"retrieved_documents": 4,
"risk_flags":[
"Incomplete UBO (Ultimate Beneficial Owner) documentation for Q3.",
"Unusual cross-border transaction volume detected in mock financial summary."
],
"summary": "TechCorp Innovations Ltd shows healthy revenue streams, but flagged transactions require secondary compliance review according to AML Policy section 3.2."
}
| Endpoint | Method | Description |
|---|---|---|
| `/api/v1/quality/metrics` | GET | Returns latest 50 quality check results |
| `/api/v1/quality/run` | POST | Triggers data quality validation checks manually |
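For calling the assessment endpoint from a script, a client might look roughly like the sketch below, which uses only the standard library. The helper names are hypothetical, the base URL assumes the local `uvicorn` run on port 8000, and treating `APPROVED`, `MANUAL_REVIEW`, and `REJECTED` as the complete set of statuses is an assumption, since the README lists them only as examples.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumes the local server described above
# Assumed to be the full set of decisions; the README gives them as examples.
ALLOWED_STATUSES = {"APPROVED", "MANUAL_REVIEW", "REJECTED"}


def build_assessment_request(
    company_name: str,
    assessment_type: str = "standard_due_diligence",
) -> urllib.request.Request:
    """Build (but do not send) the POST request for /api/v1/assess-company."""
    body = json.dumps(
        {"company_name": company_name, "assessment_type": assessment_type}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/v1/assess-company",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_assessment(raw: bytes) -> dict:
    """Decode the response body and reject statuses we do not recognize."""
    report = json.loads(raw)
    if report.get("status") not in ALLOWED_STATUSES:
        raise ValueError(f"unexpected status: {report.get('status')!r}")
    return report


def assess_company(company_name: str) -> dict:
    """Send the request; this requires the server to be running."""
    request = build_assessment_request(company_name)
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_assessment(response.read())
```

Keeping request construction and response parsing separate from the network call makes both easy to unit test without a running server.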
Quality checks run automatically on each assessment and track:
- Completeness: null/missing field rates per table
- Consistency: duplicate records and orphaned references
- Timeliness: records not assessed within 90 days
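The three checks above can be illustrated on plain in-memory records. This is a sketch under stated assumptions rather than the project's implementation: the real checks run against PostgreSQL tables, and the record layout, the field names (`company_name`, `status`, `assessed_at`), and the function names here are hypothetical. The orphaned-reference part of the consistency check is omitted for brevity.

```python
from datetime import datetime, timedelta
from typing import Dict, List

Record = Dict[str, object]


def completeness(records: List[Record], fields: List[str]) -> Dict[str, float]:
    """Fraction of records in which each field is missing or None."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if r.get(field) is None) / total
        for field in fields
    }


def duplicate_keys(records: List[Record], key: str) -> List[object]:
    """Values of `key` that occur more than once, in first-repeat order."""
    seen = set()
    dupes: List[object] = []
    for record in records:
        value = record.get(key)
        if value in seen and value not in dupes:
            dupes.append(value)
        seen.add(value)
    return dupes


def stale(
    records: List[Record],
    now: datetime,
    max_age_days: int = 90,
    key: str = "company_name",
    ts_field: str = "assessed_at",
) -> List[object]:
    """Keys of records never assessed or last assessed before the cutoff."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        r[key]
        for r in records
        if r.get(ts_field) is None or r[ts_field] < cutoff
    ]
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside `stale`, keeps the timeliness check deterministic in tests.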
To ensure AI safety and reliable performance, this system relies on LangSmith. Every API call generates a trace that tracks:
- Agent Trajectories: Step-by-step reasoning logs of the Compliance Agent.
- Retrieval Effectiveness: The exact chunks pulled from the Vector DB for each query.
- Cost & Latency: Token usage metrics for cost optimization.
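To show the kind of per-step data such a trace carries, the sketch below records a step name and latency with a plain decorator. It is a local stand-in and not LangSmith's API: in this project tracing is switched on through the `LANGCHAIN_*` variables in `.env`, and the `traced` decorator, the `TRACE` list, and the keyword-based `check_compliance` function are hypothetical names used only for illustration.

```python
import time
from functools import wraps
from typing import Callable, Dict, List

# In-memory trace log; a real setup would ship these records to a backend.
TRACE: List[Dict[str, object]] = []


def traced(step_name: str) -> Callable:
    """Record the wall-clock latency of each call under `step_name`."""
    def decorator(fn: Callable) -> Callable:
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                # Runs on success and on exceptions, so failures are timed too.
                TRACE.append(
                    {"step": step_name, "latency_s": time.perf_counter() - start}
                )
        return wrapper
    return decorator


@traced("compliance_agent")
def check_compliance(documents: List[str]) -> List[str]:
    # Toy rule standing in for the LLM-based compliance check.
    return [d for d in documents if "unusual" in d.lower()]


if __name__ == "__main__":
    check_compliance(["Unusual cross-border volume", "Healthy revenue"])
    print(TRACE)
```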
Screenshots of the LangSmith dashboard tracking this application are in the assets/ folder.
- Migrate Vector Store from local ChromaDB to Azure AI Search.
- Implement robust CI/CD pipelines via Azure DevOps.
- Add Human-in-the-Loop (HITL) approval nodes in the LangGraph workflow.
Designed and developed by Neel More as a demonstration of scalable AI engineering in the financial sector.