AI/ML Engineer (LLM Apps + MLOps) | Python | FastAPI | Airflow | Vector Search
I build production-focused AI systems: LLM agents, RAG + hybrid search, multilingual data pipelines, and deployable ML services with strong attention to latency, cost, and reliability.
- LLM Apps: LangGraph / LangChain, RAG, tracing/observability
- Backend: FastAPI, async, WebSockets, worker pools
- Search/Data: Vector DBs (e.g., Weaviate/Chroma), SQL/NoSQL, pipelines (Airflow)
- MLOps: Docker, CI/CD (GitHub Actions), monitoring (Prometheus), AWS
- Built a multilingual document-filtering pipeline (batching + caching + async) to replace a slower LLM-heavy workflow.
- Reduced report generation runtime by optimizing pipeline stages and cutting unnecessary external API calls.
- Worked on hybrid search features (vector + keyword) and implemented boolean/phrase search for diverse unstructured sources.
- Built LLM agent workflows and automated report generation (PDF/DOC), with tracing for debugging and performance analysis.
- Fine-tuned open-source LLMs for domain tasks using PEFT (LoRA/QLoRA) and built training/eval scripts.
- Improved inference latency using post-training quantization and validated output quality against the unquantized baseline.
- Deployed containerized inference services (FastAPI + Docker) and integrated CI/CD with GitHub Actions.
A medical documentation assistant prototype:
- Speech-to-text (Whisper) + structured note generation
- DICOM/X-ray report generation (prototype workflow)
- FastAPI backend + React frontend + Neo4j graph relationships
Stack: FastAPI, React, Neo4j, Docker, Whisper
Note: This is a personal prototype. I'm happy to share architecture details, benchmarks, and demos during interviews.
End-to-end pipeline to detect malicious URLs:
- Airflow DAGs for pipeline orchestration
- MLflow experiment tracking + model versioning
- Dockerized service + AWS deployment
Stack: Random Forest / XGBoost, MongoDB, Airflow, MLflow, Docker, AWS
RAG-based pricing analysis over a large product catalog:
- Vector search with sentence embeddings + ChromaDB
- Agentic analysis workflow
Stack: RAG, Vector DB, Sentence Transformers, LangChain
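
The retrieval step in a project like this can be illustrated with a minimal sketch of top-k cosine-similarity search. This is not the project's code: the hand-written three-dimensional vectors stand in for Sentence Transformers embeddings, a plain dictionary stands in for ChromaDB, and the names `cosine`, `top_k`, and `catalog` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec, catalog, k=2):
    """Return the names of the k catalog items most similar to the query."""
    scored = [(cosine(query_vec, vec), name) for name, vec in catalog.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]

# Toy "embeddings" (assumed values, for illustration only).
catalog = {
    "wireless mouse": [0.9, 0.1, 0.0],
    "gaming keyboard": [0.7, 0.3, 0.1],
    "espresso machine": [0.0, 0.2, 0.9],
}

print(top_k([1.0, 0.0, 0.0], catalog, k=2))
```

In a real setup, the retrieved items would be passed to the LLM as context for the pricing analysis.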
Explored a transformer attention variant that injects a learnable locality bias to reduce early training noise and improve stability.
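
One possible form of such a bias is sketched below in plain Python. It assumes the bias is a linear penalty on token distance, `-slope * |i - j|`, added to the scaled dot-product scores before the softmax, with `slope` as the learnable scalar. The actual variant explored may differ; the function names and the form of the bias are assumptions made for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def locality_biased_attention(q, k, v, slope):
    """Single-head attention with an additive distance penalty.

    q, k, v: lists of float vectors, one per sequence position.
    slope:   hypothetical scalar (learnable during training) that
             penalizes attention to distant positions.
    Returns (outputs, attention_weights).
    """
    scale = 1.0 / math.sqrt(len(q[0]))
    outputs, weights = [], []
    for i, qi in enumerate(q):
        scores = [
            scale * sum(a * b for a, b in zip(qi, kj)) - slope * abs(i - j)
            for j, kj in enumerate(k)
        ]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(w[j] * v[j][c] for j in range(len(v))) for c in range(len(v[0]))]
        )
    return outputs, weights

# With identical keys, slope=0 gives uniform attention, while slope>0
# concentrates attention on nearby positions.
q = [[0.0, 0.0] for _ in range(4)]
k = [[0.0, 0.0] for _ in range(4)]
v = [[float(i)] for i in range(4)]
_, uniform = locality_biased_attention(q, k, v, slope=0.0)
_, local = locality_biased_attention(q, k, v, slope=1.0)
```

With `slope` initialized above zero, each position attends mostly to its neighbors early in training, which is one way a locality prior could damp noisy long-range attention.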
Languages: Python, SQL, JavaScript
ML/DL: PyTorch, TensorFlow, Scikit-learn, Hugging Face, XGBoost
LLM Systems: LangGraph, LangChain, RAG, evaluation + tracing
Data/MLOps: Airflow, Docker, GitHub Actions, MLflow, Prometheus
Databases: PostgreSQL, MongoDB, Neo4j, ChromaDB (and other vector DBs)
- Microsoft Certified: Azure AI Engineer Associate (AI-102)
- Email: amanagnihotri902@gmail.com
- LinkedIn: https://www.linkedin.com/in/aman-agnihotri004/
- GitHub: https://github.com/its-amann
- Production-grade RAG evaluation + retrieval testing
- Faster inference + caching strategies for LLM apps
- Better data quality + pipeline reliability