This project is a complete, containerized prototype of a Retrieval-Augmented Generation (RAG) application. It uses an LLM to answer questions based on a custom knowledge base, served via a high-performance FastAPI backend.
- Retrieval-Augmented Generation (RAG): Provides answers to user questions grounded in a specific set of documents, reducing hallucinations and providing context-aware responses.
- FastAPI Backend: A modern, asynchronous, and high-performance API to serve the RAG chain.
- Dockerized Environment: The entire application is containerized with Docker and orchestrated with Docker Compose for an easy, consistent, and reproducible setup.
- Scalable & Deployable: Built with deployment in mind, ready to be pushed to any cloud service that supports containers.
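For readers new to the pattern, here is a minimal, dependency-free sketch of the retrieve-then-augment idea. This is an illustration only — the actual app uses LangChain, OpenAI embeddings, and ChromaDB rather than this toy bag-of-words retriever:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Augment the query with retrieved context before sending it to the LLM."""
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

The real chain swaps each piece for a production component: `embed` becomes an OpenAI embedding call, `retrieve` a ChromaDB similarity search, and the prompt is sent to the LLM via LangChain.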
- Backend: FastAPI
- ASGI Server: Uvicorn
- AI Framework: LangChain
- LLM Provider: OpenAI
- Vector Store: ChromaDB
- Containerization: Docker & Docker Compose
Follow these instructions to get the project up and running on your local machine.
- Docker and Docker Compose installed on your system.
- An OpenAI API Key.
- Clone the repository:

  ```bash
  git clone <your-repository-url>
  cd <your-project-directory>
  ```

- Create the environment file: Create a file named `.env` in the root of the project directory and add your OpenAI API key, along with the key clients must send in the `x-api-key` header:

  ```
  OPENAI_API_KEY=sk-YourSecretApiKeyHere
  API_KEY=secretkey
  ```
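As a sketch of how the application might consume these variables at startup (the actual loading code lives in the app and is not shown here; `load_settings` is a hypothetical helper):

```python
import os

def load_settings() -> dict:
    """Read the keys the app expects; Docker Compose injects them from .env."""
    openai_key = os.getenv("OPENAI_API_KEY")
    if not openai_key:
        raise RuntimeError("OPENAI_API_KEY is not set - check your .env file")
    # API_KEY is the value clients must send in the x-api-key request header.
    return {
        "openai_api_key": openai_key,
        "api_key": os.getenv("API_KEY", "secretkey"),
    }
```

Failing fast on a missing `OPENAI_API_KEY` gives a clear error at container start instead of an opaque failure on the first request.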
- Add your data: Place the PDF or text files you want to use as your knowledge base inside the `/data` directory.
- Build the knowledge base: Run the ingestion script. This will process your documents and create the local vector store in a `./chroma_db` directory.

  ```bash
  docker-compose run --rm --build rag-app python ingest.py
  ```

  Note: The `--rm` flag automatically removes the container after the script finishes.
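Under the hood, ingestion typically loads each document, splits it into overlapping chunks, embeds the chunks, and persists the vectors. A minimal sketch of the chunking step — the real `ingest.py` presumably uses LangChain's text splitters, and the sizes below are illustrative:

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap, so sentences cut at a
    boundary still appear whole in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Advance by chunk_size minus overlap so consecutive chunks share text.
        start += chunk_size - overlap
    return chunks
```

The overlap trades a little storage for better retrieval: a fact straddling a chunk boundary is intact in the neighboring chunk.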
- Run the application: Start the FastAPI application using Docker Compose.

  ```bash
  docker-compose up -d
  ```

  The `-d` flag runs the container in detached mode. The application will be available at http://localhost:8000.
You can interact with the API through its documentation, which is automatically generated by FastAPI.
- Interactive Docs (Swagger): http://localhost:8000/docs
- Alternative Docs (ReDoc): http://localhost:8000/redoc
Here is an example of how to send a query to the `/generate` endpoint from your terminal:

```bash
curl -X 'POST' \
  'http://localhost:8000/generate' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'x-api-key: secretkey' \
  -d '{
    "query": "What is the best way to handle path parameters in FastAPI?"
  }'
```
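The same request can be made from Python using only the standard library. `build_generate_request` is a hypothetical helper written for this illustration, not part of the project:

```python
import json
import urllib.request

def build_generate_request(query: str,
                           api_key: str = "secretkey",
                           base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request matching the curl example above."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/generate",
        data=body,
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
            "x-api-key": api_key,  # must match API_KEY from .env
        },
        method="POST",
    )

# With the app running, send it and read the JSON response:
#   with urllib.request.urlopen(build_generate_request("your question")) as resp:
#       print(json.load(resp))
```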
This application has been successfully deployed using Azure Container Apps and is live at the following URL:
➡️ Live Application URL: [RAG-APP]
To use the RAG app, open the interactive docs, expand the `/generate` endpoint, and click "Try it out". Enter your question and set the `x-api-key` field to `secretkey`.
The following papers have been ingested into the knowledge base:
- Attention Is All You Need
- Language Models are Few-Shot Learners
- Denoising Diffusion Probabilistic Models
- High-Resolution Image Synthesis with Latent Diffusion Models
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- Neural Networks are Decision Trees
- Segment Anything
```
/
|-- .env                   # Environment variables (OpenAI API Key)
|-- .gitignore             # Files to ignore for git
|-- Dockerfile             # Instructions to build the application container
|-- docker-compose.yml     # Defines the services for local development
|-- ingest.py              # Script to process data and build the vector store
|-- requirements.txt       # Python dependencies
|-- README.md              # This file
|
|-- /app/                  # Main application source code
|   |-- main.py            # FastAPI application and endpoints
|   |-- rag_logic.py       # RAG chain creation and logic
|
|-- /data/                 # Source documents for the knowledge base
|   |-- /chroma_db/        # Persisted ChromaDB vector store (created by ingest.py)
|   |-- README.md
```