A structured 24-week program for software professionals who already know Python/Git.
Philosophy: 15% theory, 85% hands-on. Each week's material takes roughly 2-3 hours/day for seven days. Every step is explicit enough to follow without prior AI experience.
Your first AI apps. By Week 6 you'll have a deployed chat app with streaming.
Topic: Building REST APIs with FastAPI (foundation for all AI apps)
Project: Personal Expense Tracker API
Key Skills: REST endpoints, Pydantic validation, JSON persistence
Theory: 1 hour | Build: 7+ hours
- Build CRUD endpoints from scratch
- Learn request validation with Pydantic
- Implement file-based persistence
- Use Swagger UI for testing
- Handle errors and edge cases
Topic: Talking to LLMs through their APIs
Project: Movie Recommendation Chatbot (CLI)
Key Skills: API authentication, conversation history, token tracking
Theory: 1 hour | Build: 7+ hours
- Make your first OpenAI/Anthropic API call
- Maintain conversation history
- Track tokens and estimate costs
- Save/load conversations
- Experiment with temperature effects
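The history and token-tracking pieces can be kept separate from the network call. A rough sketch (the `ChatSession` class is illustrative, and the 4-characters-per-token estimate is a heuristic; use `tiktoken` for exact counts):

```python
class ChatSession:
    """Keeps the message list the chat APIs expect and a rough token estimate."""

    def __init__(self, system_prompt: str):
        self.messages = [{"role": "system", "content": system_prompt}]

    def add_user(self, text: str):
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text: str):
        self.messages.append({"role": "assistant", "content": text})

    def estimated_tokens(self) -> int:
        # Crude heuristic: ~4 characters per token for English text
        return sum(len(m["content"]) for m in self.messages) // 4

session = ChatSession("You recommend movies.")
session.add_user("Suggest a sci-fi film from the 90s.")
# In the real app, send session.messages to the chat completions API,
# then append the reply with session.add_assistant(...)
```

Because the full list is re-sent on every turn, history is what makes the model "remember" the conversation, and also what makes long chats expensive.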
Topic: Writing prompts that produce reliable, consistent outputs
Project: Email Writer (tone & audience adaptation)
Key Skills: Few-shot learning, chain-of-thought, prompt comparison
Theory: 1 hour | Build: 7+ hours
- Write specific, actionable prompts
- Use examples to improve quality
- Chain-of-thought reasoning
- Compare multiple prompt strategies
- Build a rewrite tool with tone/audience control
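Few-shot prompting is just structured message assembly: each example becomes a user/assistant pair before the real input. A sketch (function name and example strings are illustrative):

```python
def few_shot_messages(system: str, examples: list[tuple[str, str]], new_input: str) -> list[dict]:
    """Build a chat message list where each example is a user/assistant pair,
    so the model imitates the demonstrated input-output mapping."""
    messages = [{"role": "system", "content": system}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": new_input})
    return messages

msgs = few_shot_messages(
    "Rewrite emails in the requested tone.",
    [("Make this formal: hey, meeting moved to 3",
      "Please note the meeting has been rescheduled to 3:00 PM.")],
    "Make this friendly: per my last email, the report is overdue",
)
```

Swapping the examples list is how the rewrite tool switches tone and audience without changing any other code.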
Topic: Getting reliable JSON from LLMs
Project: Job Posting Parser
Key Skills: Pydantic validation, Instructor library, structured generation
Theory: 1 hour | Build: 7+ hours
- Define Pydantic models for data validation
- Use OpenAI JSON mode
- Automatic retries with Instructor
- Handle missing fields gracefully
- Batch process documents to CSV
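The validation side works the same whether the JSON comes from an LLM or anywhere else. A sketch of the Pydantic pattern, with `Optional` defaults handling missing fields (the `JobPosting` fields are illustrative):

```python
from typing import Optional
from pydantic import BaseModel, ValidationError

class JobPosting(BaseModel):
    title: str
    company: str
    salary_min: Optional[int] = None  # missing fields become None instead of failing
    remote: bool = False

# Parse a model response that omits the optional fields
raw = '{"title": "ML Engineer", "company": "Acme", "remote": true}'
job = JobPosting.model_validate_json(raw)

# A response missing a *required* field raises, which is the signal to retry
try:
    JobPosting.model_validate_json('{"company": "Acme"}')
    error_fields = []
except ValidationError as exc:
    error_fields = [e["loc"][0] for e in exc.errors()]
```

Instructor wires exactly this loop into the API call: pass `response_model=JobPosting` and it retries automatically when validation fails.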
Topic: Function/tool calling (the LLM decides which function to call)
Project: Personal Assistant with Real Tools (stocks, weather, notes)
Key Skills: Tool-calling loop, parallel tool execution, logging
Theory: 1 hour | Build: 7+ hours
- Define tools as JSON schemas
- Implement the tool-calling loop
- Handle multi-tool queries
- Add error handling and logging
- Understand how ChatGPT plugins work
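The dispatch half of the loop is plain Python. A sketch with a stubbed tool (the weather function and registry are illustrative; the call shape `{"name": ..., "arguments": "<json string>"}` mirrors what the chat APIs return):

```python
import json

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub; the real tool would call a weather API

TOOLS = {"get_weather": get_weather}

def run_tool_call(call: dict) -> str:
    """Execute one tool call requested by the model, returning the result
    (or an error string) to append back as a tool message."""
    fn = TOOLS.get(call["name"])
    if fn is None:
        return f"Error: unknown tool {call['name']}"
    try:
        return fn(**json.loads(call["arguments"]))
    except Exception as exc:  # surface failures to the model instead of crashing
        return f"Error: {exc}"

result = run_tool_call({"name": "get_weather", "arguments": '{"city": "Oslo"}'})
```

The full loop repeats: send messages plus tool schemas, execute whatever calls the model requests, append each result as a tool message, and stop when the model answers in plain text.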
Topic: Streaming responses + building a chat UI with Streamlit
Project: Full AI Chat App (deployed to Streamlit Cloud)
Key Skills: Streaming APIs, Streamlit components, cloud deployment
Theory: 1 hour | Build: 7+ hours
- Implement streaming responses (tokens arrive live)
- Build chat interface with Streamlit
- Add controls: model selector, temperature slider, system prompt editor
- Track token usage and costs
- Deploy to Streamlit Cloud (free)
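Streaming is just consuming a generator of text deltas. A simulated version (the real client iterates chunks from the API call with `stream=True` instead of this stand-in):

```python
def fake_stream(text: str, chunk_size: int = 4):
    """Yield text a few characters at a time, mimicking how a streaming API
    delivers deltas instead of one final response."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

# The UI renders each piece as it arrives; here we just collect them
collected = "".join(fake_stream("Streaming keeps the UI responsive."))
```

In Streamlit, passing such a generator to `st.write_stream(...)` renders the tokens live as they arrive.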
The #1 skill companies hire for. Build systems that let LLMs answer from YOUR documents.
Topic: Convert text to vectors, search by meaning
Project: Semantic Search Engine for Notes
Key Skills: Embeddings, cosine similarity, t-SNE visualization
Theory: 1 hour | Build: 7+ hours
- Understand embeddings conceptually and mathematically
- Implement cosine similarity from scratch
- Build semantic search engine
- Visualize embeddings with t-SNE
- Compare model speed vs quality tradeoffs
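Cosine similarity from scratch is a few lines of math: the dot product divided by the product of the vector lengths, giving 1.0 for identical directions and 0.0 for orthogonal ones.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """dot(a, b) / (|a| * |b|) - direction match, independent of magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Semantic search is then: embed every note, embed the query, and return the notes whose embeddings have the highest cosine similarity to the query's.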
Topic: RAG fundamentals (Retrieval-Augmented Generation)
Project: PDF Q&A System with Citations
Key Skills: Document chunking, ChromaDB, retrieval, grounding LLM answers
Theory: 1 hour | Build: 7+ hours
- Extract and chunk PDFs intelligently
- Store vectors in ChromaDB
- Retrieve relevant chunks for queries
- Feed context to LLM with citations
- Build Streamlit UI and test failures
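The simplest chunking strategy is fixed-size windows with overlap, so text split at a boundary still appears whole in at least one chunk. A sketch (parameter values are illustrative defaults; production splitters also respect sentence and paragraph boundaries):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Slice text into fixed-size chunks; consecutive chunks share
    `overlap` characters so no passage is cut off in every chunk."""
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]
```

Each chunk is then embedded and stored in the vector DB; at query time the top-scoring chunks become the context the LLM must cite.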
Topic: Build RAG the industry-standard way
Project: Multi-Document Q&A with Source Citations
Key Skills: LangChain document loaders, text splitters, retrieval chains
Theory: 1 hour | Build: 7+ hours
- Load PDFs, markdown, CSV with LangChain loaders
- Use RecursiveCharacterTextSplitter
- Build a RetrievalQA chain
- Return source documents with answers
- Handle multiple document types in one pipeline
Topic: Fix RAG failures: reranking, query rewriting, evaluation
Project: Evaluate and Improve Your RAG System
Key Skills: RAGAS metrics, reranking, query decomposition, hallucination prevention
Theory: 1 hour | Build: 7+ hours
- Create a golden test set (20 question-answer pairs)
- Run automated evaluation with RAGAS
- Add reranking with cross-encoders
- Implement query decomposition for multi-hop questions
- Measure before/after improvement
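Reranking is a second scoring pass over the retrieved candidates. The sketch below uses word overlap purely as a stand-in scorer so it runs offline; the real project scores each (query, chunk) pair with a cross-encoder (e.g. a sentence-transformers `CrossEncoder` model), but the reordering step is identical:

```python
def rerank(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Reorder retrieved chunks by relevance score and keep the best top_k.
    Stand-in scorer: word overlap with the query (a cross-encoder replaces this)."""
    query_words = set(query.lower().split())
    def score(chunk: str) -> int:
        return len(query_words & set(chunk.lower().split()))
    return sorted(chunks, key=score, reverse=True)[:top_k]
```

The pattern to internalize: retrieve broadly (say, top 20 by embedding similarity), then rerank precisely and pass only the best few chunks to the LLM.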
Topic: Deploy a production-ready RAG system
Project: Full RAG App with FastAPI + Qdrant + Docker
Key Skills: Qdrant, FastAPI backend, incremental indexing, Docker Compose
Theory: 1 hour | Build: 7+ hours
- Switch from ChromaDB to Qdrant (production vector DB)
- Build FastAPI backend with /ask and /upload endpoints
- Add incremental document processing (only re-embed changed files)
- Dockerize everything with docker-compose
- One command starts the full stack
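A compose file for this stack might look like the sketch below. Service names, ports, and the environment variable are illustrative; the key ideas are that the API reaches Qdrant by its service name on the internal network, and a named volume persists vectors across restarts.

```yaml
services:
  qdrant:
    image: qdrant/qdrant
    ports: ["6333:6333"]
    volumes: ["qdrant_data:/qdrant/storage"]   # vectors survive container restarts
  api:
    build: .                                   # your FastAPI app's Dockerfile
    ports: ["8000:8000"]
    environment:
      QDRANT_URL: http://qdrant:6333           # service name resolves inside the network
    depends_on: [qdrant]
volumes:
  qdrant_data:
```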
Make LLMs that don't just answer — they take action.
Topic: Build an agent using just the API (no frameworks)
Project: ReAct Agent with 3 Tools
Key Skills: ReAct pattern, tool execution loop, iteration limits
Theory: 1 hour | Build: 7+ hours
- Implement the ReAct loop (Reason → Act → Observe)
- Give agent 3 tools (search, calculate, read file)
- Handle tool execution errors
- Add max iteration limits (prevent infinite loops)
- Log every step for debugging
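The whole ReAct skeleton fits in one function. This sketch injects the model as a callable so it runs offline with a scripted stand-in; in the real agent, `llm` parses the API's response into the same action/answer shape:

```python
def react_agent(question: str, llm, tools: dict, max_steps: int = 5) -> str:
    """Minimal ReAct loop: ask the model for its next move, run the chosen
    tool, feed the observation back, and stop on a final answer or the
    iteration limit (which prevents infinite loops)."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        move = llm(transcript)            # e.g. {"action": "calculate", "input": "2+2"}
        if "answer" in move:
            return move["answer"]
        tool = tools.get(move["action"])
        observation = tool(move["input"]) if tool else f"Unknown tool {move['action']}"
        transcript += f"Action: {move['action']}\nObservation: {observation}\n"
    return "Stopped: hit max iterations"

# Scripted stand-in for the model so the loop is testable without an API key
script = iter([
    {"action": "calculate", "input": "2+2"},
    {"answer": "4"},
])
answer = react_agent("What is 2+2?", lambda transcript: next(script),
                     {"calculate": lambda expr: str(eval(expr))})
```

Logging `transcript` after every step gives you the full reasoning trace for debugging.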
Topic: Build stateful agents with LangGraph
Project: Web Research Agent
Key Skills: State graphs, conditional edges, tool integration, Tavily search
Theory: 1 hour | Build: 7+ hours
- Define agent state with TypedDict
- Build a state graph with conditional edges
- Integrate Tavily search API
- Add human-in-the-loop checkpoints
- Handle failures gracefully
Topic: Chains and workflows: when you DON'T need agents
Project: Content Repurposing Pipeline
Key Skills: Prompt chaining, parallelization, routing, orchestration
Theory: 1 hour | Build: 7+ hours
- Build a 3-step pipeline: extract facts → generate content → score quality
- Run parallel LLM calls (tweet + LinkedIn + summary at once)
- Add routing: classify input, send to specialized handler
- Compare workflow vs agent: speed, cost, reliability
- Learn when NOT to use agents
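Running independent LLM calls in parallel needs nothing beyond the standard library, since the calls are network-bound. A sketch with a stubbed generation function standing in for the API call:

```python
from concurrent.futures import ThreadPoolExecutor

def generate(task: str) -> str:
    # Stand-in for one LLM call; network-bound work like this is
    # exactly what threads parallelize well
    return f"[{task} draft]"

tasks = ["tweet", "linkedin_post", "summary"]
with ThreadPoolExecutor() as pool:
    # All three "calls" run concurrently; results come back in task order
    drafts = dict(zip(tasks, pool.map(generate, tasks)))
```

With real API latency of a few seconds per call, the three-way fan-out finishes in roughly the time of the slowest single call, which is the core argument for workflows over sequential agent steps when the subtasks are independent.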
Topic: Multiple agents collaborating on a task
Project: Blog Writing Crew (Researcher → Writer → Editor)
Key Skills: CrewAI, agent roles, task delegation, inter-agent communication
Theory: 1 hour | Build: 7+ hours
- Define specialized agents with roles and backstories
- Create tasks with dependencies
- Watch agents collaborate in verbose mode
- Track cost per agent
- Compare output quality: single agent vs crew
Topic: Systematically test your AI systems
Project: Test Suite for Your RAG and Agents
Key Skills: DeepEval, golden test sets, LLM-as-judge, CI integration
Theory: 1 hour | Build: 7+ hours
- Create test cases for RAG (answerable, unanswerable, adversarial)
- Create test cases for agents (correct tool selection, refusal, timeouts)
- Build an evaluation harness that scores automatically
- Set up GitHub Actions to run evals on every push
- Track quality over time
Make your AI apps survive real users, real traffic, and real failures.
Topic: Containerize and deploy your AI apps
Project: Dockerize Your RAG App
Key Skills: Dockerfiles, Docker Compose, multi-service apps, health checks
Theory: 1 hour | Build: 7+ hours
- Write Dockerfiles for FastAPI and Streamlit
- Create docker-compose.yml with backend + frontend + vector DB
- Add .dockerignore and environment variable management
- Add health check endpoints
- Verify that `docker compose up` starts everything
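A Dockerfile for the FastAPI service might look like the sketch below. It assumes an entrypoint of `main:app` and a `/health` endpoint, both of which are project-specific choices, not requirements.

```dockerfile
FROM python:3.12-slim
WORKDIR /app
# Copy requirements first so dependency layers are cached across code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
# A health check lets Compose (and orchestrators) restart a wedged container
HEALTHCHECK CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```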
Topic: API authentication, rate limiting, prompt injection defense
Project: Secure Your Deployed RAG App
Key Skills: JWT tokens, API keys, rate limiting, input validation
Theory: 1 hour | Build: 7+ hours
- Add API key authentication to endpoints
- Implement rate limiting (10 requests/minute per key)
- Add prompt injection defenses
- Test attacks against your app
- Add request logging and CORS
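The "10 requests/minute per key" rule is a sliding-window limiter, which is small enough to write by hand before reaching for a library. A sketch (class name is illustrative):

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Sliding-window limiter: at most `limit` requests per `window` seconds
    per API key. Each key keeps a deque of recent request timestamps."""

    def __init__(self, limit: int = 10, window: float = 60.0):
        self.limit, self.window = limit, window
        self.hits: dict[str, deque] = defaultdict(deque)

    def allow(self, api_key: str) -> bool:
        now = time.monotonic()
        recent = self.hits[api_key]
        while recent and now - recent[0] > self.window:  # evict expired timestamps
            recent.popleft()
        if len(recent) >= self.limit:
            return False  # over the limit: respond with HTTP 429
        recent.append(now)
        return True
```

In FastAPI this typically lives in a dependency or middleware that raises `HTTPException(status_code=429)` when `allow` returns False.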
Topic: Trace every LLM call, log structured data
Project: Add Full Observability with Langfuse
Key Skills: Langfuse tracing, structlog, metrics, alerting
Theory: 1 hour | Build: 7+ hours
- Add structured JSON logging with structlog
- Instrument RAG pipeline with Langfuse traces
- Track latency, tokens, cost per request
- Build a simple monitoring dashboard
- Add alerts for error rate spikes
Topic: Make AI apps affordable at scale
Project: Add Redis Caching + Cost Tracking
Key Skills: Redis, response caching, model routing, spend limits
Theory: 1 hour | Build: 7+ hours
- Add Redis to docker-compose
- Implement exact-match response caching with TTL
- Build a cost tracking endpoint
- Route easy questions to cheap models, hard ones to expensive models
- Measure real savings: 100 queries with/without cache
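Exact-match caching with TTL is conceptually a dict keyed by prompt with an expiry timestamp. An in-process sketch of the semantics (Redis gives you the same behavior with `SETEX`, shared across processes):

```python
import time

class TTLCache:
    """Exact-match response cache: an identical prompt skips the LLM call
    entirely until its entry expires after `ttl` seconds."""

    def __init__(self, ttl: float = 3600.0):
        self.ttl = ttl
        self.store: dict[str, tuple[float, str]] = {}  # prompt -> (expiry, response)

    def get(self, prompt: str):
        entry = self.store.get(prompt)
        if entry is None:
            return None
        expires_at, response = entry
        if time.monotonic() > expires_at:  # stale: evict and treat as a miss
            del self.store[prompt]
            return None
        return response

    def set(self, prompt: str, response: str):
        self.store[prompt] = (time.monotonic() + self.ttl, response)
```

The app's flow becomes: check the cache, return on a hit, otherwise call the model and `set` the result, so repeated questions cost nothing.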
Open-source models, fine-tuning, and your capstone.
Topic: Run LLMs locally with zero API cost
Project: Run Llama/Mistral Locally + Swap Into Your RAG App
Key Skills: Ollama, model comparison, local inference
Theory: 1 hour | Build: 7+ hours
- Install Ollama, pull and chat with Llama 3.1
- Use from Python — swap into your RAG app (one line change)
- Benchmark 4 models: accuracy, speed, quality
- Create a comparison table in README
- Understand when local vs API makes sense
Topic: Production-speed model serving
Project: High-Performance Model Server + Benchmarks
Key Skills: vLLM, OpenAI-compatible serving, concurrent benchmarking, quantization
Theory: 1 hour | Build: 7+ hours
- Start vLLM as an OpenAI-compatible server
- Benchmark: 50 concurrent requests, measure p50/p95/p99 latency
- Compare Ollama vs vLLM throughput
- Try quantized models (AWQ) — speed vs quality tradeoff
- Add vLLM to your docker-compose stack
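The p50/p95/p99 summary is a small amount of stdlib math once you've collected per-request latencies. A sketch using `statistics.quantiles`, which returns the 99 cut points between percentile buckets when asked for `n=100`:

```python
import statistics

def latency_percentiles(latencies_ms: list[float]) -> dict[str, float]:
    """Summarize request latencies: p50 is the typical request,
    p95/p99 capture the tail that users actually complain about."""
    cuts = statistics.quantiles(latencies_ms, n=100)  # 99 percentile cut points
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```

Collect the latencies by timing each of the 50 concurrent requests individually (for example with `time.monotonic()` around each call in a thread pool), then feed the list to this function for both servers.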
Topic: Teach a model your specific task
Project: Fine-Tune a Text-to-SQL Model with Unsloth
Key Skills: Dataset preparation, LoRA/QLoRA, training, evaluation, GGUF export
Theory: 1 hour | Build: 7+ hours
- Prepare 200+ training examples (natural language → SQL)
- Fine-tune Llama 3.2 3B with QLoRA on Google Colab
- Evaluate: base model vs fine-tuned on held-out test set
- Export to GGUF, load in Ollama
- Your custom model runs locally for free
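Dataset preparation mostly means serializing your examples into chat-format JSONL: one `{"messages": [...]}` object per line. A sketch (the system prompt and example pair are illustrative; check your fine-tuning tool's docs for its exact expected schema):

```python
import json

def to_chat_jsonl(pairs: list[tuple[str, str]], system: str) -> str:
    """Serialize (question, sql) pairs into chat-format JSONL,
    one training example per line."""
    lines = []
    for question, sql in pairs:
        lines.append(json.dumps({"messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": question},
            {"role": "assistant", "content": sql},
        ]}))
    return "\n".join(lines)

jsonl = to_chat_jsonl(
    [("How many users signed up today?",
      "SELECT COUNT(*) FROM users WHERE created_at >= CURRENT_DATE;")],
    "Translate questions into SQL for the analytics database.",
)
```

Keep a held-out slice of the 200+ pairs out of training entirely; that slice is the test set for the base-vs-fine-tuned comparison.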
Topic: End-to-end AI product using everything you've learned
Project: Pick one and build it from scratch
Key Skills: System design, architecture, full-stack AI development
Theory: None. Build.
- Option A: AI Code Documentation Generator. Point at a repo, auto-generate docs, searchable Q&A.
- Option B: AI Data Analysis Assistant. Upload CSV, ask questions in English, get charts + reports.
- Option C: Domain Knowledge Base. RAG + fine-tuned model + evaluation + monitoring for your domain.
Requirements: Docker Compose, architecture diagram, eval results, monitoring, CI/CD, cost analysis.
After 24 weeks, you'll be able to:
- Build LLM apps with APIs, structured outputs, and tool calling
- Build RAG systems that answer from your documents with citations
- Build AI agents that take multi-step actions autonomously
- Evaluate AI systems with automated test suites and quality metrics
- Deploy everything with Docker, monitoring, auth, and caching
- Run open-source models locally and serve them at production speed
- Fine-tune models for your specific use case
- Ship a complete AI product end-to-end
| Category | Tools |
|---|---|
| Backend | FastAPI, uvicorn |
| LLM APIs | OpenAI, Anthropic (Claude) |
| Validation | Pydantic, Instructor |
| Web UI | Streamlit |
| Embeddings | sentence-transformers |
| Vector DBs | ChromaDB (dev), Qdrant (prod) |
| RAG Frameworks | LangChain, LlamaIndex |
| Agents | LangGraph, CrewAI |
| Evaluation | RAGAS, DeepEval |
| Local Models | Ollama, vLLM |
| Fine-Tuning | Unsloth, TRL |
| Deployment | Docker, Docker Compose |
| Monitoring | Langfuse, structlog |
| Caching | Redis |
| CI/CD | GitHub Actions |
- Solid Python knowledge (classes, functions, async is helpful)
- Git basics (clone, commit, push)
- Terminal/CLI comfort
- ~2-3 hours per day for 24 weeks
AI Engineer Roadmap/
├── README.md
├── week-01/idea.md FastAPI Crash Course
├── week-02/idea.md First LLM API Call
├── week-03/idea.md Prompt Engineering
├── week-04/idea.md Structured Outputs
├── week-05/idea.md Tool Calling
├── week-06/idea.md Streaming + Web UI
├── week-07/idea.md Embeddings + Semantic Search
├── week-08/idea.md Chunking + Vector DB (RAG)
├── week-09/idea.md RAG with LangChain
├── week-10/idea.md Advanced RAG + Evaluation
├── week-11/idea.md Production RAG App
├── week-12/idea.md Agent Loops from Scratch
├── week-13/idea.md LangGraph Agents
├── week-14/idea.md Multi-Step Workflows
├── week-15/idea.md Multi-Agent Systems (CrewAI)
├── week-16/idea.md Evaluation + Testing
├── week-17/idea.md Docker + Deployment
├── week-18/idea.md Auth + Security
├── week-19/idea.md Monitoring + Observability
├── week-20/idea.md Cost Control + Caching
├── week-21/idea.md Open Source Models (Ollama)
├── week-22/idea.md Model Serving (vLLM)
├── week-23/idea.md Fine-Tuning (Unsloth)
└── week-24/idea.md Capstone Project
Each idea.md contains: theory links, library setup, step-by-step project, working code snippets, common mistakes, and GitHub push guide.
- Clone this repo
- Start with Week 1 — read the theory, follow the project steps
- Type the code (don't just copy-paste)
- Push each week's project to GitHub
- Move forward — each week builds on previous ones
- Start applying for jobs after Week 16
24 weeks. 24 projects. Build something every week. That's it.