Building production-grade GenAI applications with focus on LLM integration, agentic workflows, real-time systems, and MCP-based architecture.
Backend engineer with 3+ years of experience building scalable systems and extensively working with production-grade Generative AI applications.
- Strong in Python backend + LLM infrastructure (RAG, agents, MCP, streaming, real-time systems)
- Building complex GenAI systems that handle real constraints (latency, rate limiting, state management)
- Focused on performance, scalability, and production-ready code
- Experience with Claude API, LangChain, LangGraph, vector databases, and MCP frameworks
- Agentic Workflows → multi-step reasoning, tool orchestration, dynamic decision-making
- RAG Systems → semantic search, document intelligence, retrieval optimization
- LLM-powered APIs → Claude integration, streaming responses, cost optimization
- Real-time AI Systems → low-latency decision making, event-driven architectures
- MCP Servers → structured tool definitions, API integration, scalable tool abstractions
Python • Java • SQL
FastAPI • Flask • Django • REST API Design • Async/Await • Event-driven Architecture
- LLM Frameworks: Claude API, LangChain, LangGraph
- Retrieval & Search: RAG pipelines, vector databases, semantic search, hybrid search
- Agent Architecture: ReAct pattern, tool orchestration, state management, reasoning loops
- Infrastructure: Model Context Protocol (MCP), streaming, prompt engineering
- Fine-tuning & Optimization: Prompt optimization, cost reduction, latency tuning
AWS (Lambda, API Gateway, S3, Bedrock, DynamoDB, CloudWatch, ECS Fargate, SQS) • Docker • PostgreSQL with Vector Extensions
PostgreSQL • MongoDB • DynamoDB • Vector Databases (Pinecone, Weaviate concepts)
Problem: Make trading decisions in <30 seconds using real-time market data under API rate limits.
Claude-powered agent for SENSEX/Nifty 50 options using MCP. Combines price action reasoning with real-time data orchestration.
Implementation highlights:
- MCP Server Design: Built structured tool definitions for market snapshots, candle analysis, options pricing, technical indicators
- LLM Integration: Claude reasoning loop for entry/exit decisions using price action analysis (not lagging indicators)
- API Constraint Handling: Solved Angel One rate limiting (50 req/min during live hours) by migrating to Dhan API (25 req/sec, 250/min, 5000/day)
- Real-time Architecture: Async event-driven backend with <500ms response times for decision-making
- State Management: Persistent trading rules and decision history for pattern recognition across trades
- Production Rules: Trading rulebook with bounce trap detection, double-bottom retest patterns, OI-based strike selection
Why this matters:
Shows you can build LLM systems under real constraints: time pressure, API limits, state persistence, and production-grade decision logic.
Repository: https://github.com/SomuTech/realtime-options-agent
Problem: Semantic search and Q&A over large document collections without full document re-reads.
End-to-end RAG system for intelligent document search, answer generation, and document intelligence.
Implementation highlights:
- RAG Pipeline: Document chunking, embedding generation, vector storage, semantic retrieval, LLM generation
- Retrieval Optimization: Hybrid search (semantic + keyword matching) for precision improvement
- Performance Tuning: Chunk size and overlap optimization reduced hallucination rate and improved F1 score
- Backend Scalability: FastAPI service with PostgreSQL for metadata, vector indexing for semantic search
- Production Features: Caching layer, concurrent request handling, error recovery, response streaming
Why this matters:
Shows you understand modern RAG patterns, retrieval engineering, and production LLM application architecture.
Repository: https://github.com/SomuTech/genai-experiments/tree/main/smartdoc_assistant_RAG
Problem: Automate complex research tasks requiring multi-step reasoning, tool usage, and iterative refinement.
Multi-agent system for research automation using agentic workflows and tool orchestration.
Implementation highlights:
- Agentic Loop: Planning → tool execution → reflection → iteration using LangGraph
- Tool Orchestration: Dynamic tool selection and chaining for research workflows
- State Management: Persistent reasoning state, memory management for long-running workflows
- Extensibility: Plugin architecture for adding new tools and agents without code changes
- Error Handling: Graceful failure recovery, fallback strategies, timeout management
Why this matters:
Shows you can build complex agentic systems with robust state management and production-grade error handling.
Repository: https://github.com/SomuTech/genai-experiments/tree/main/deep_research_AGENT
- Agentic AI architectures and ReAct patterns
- LLM infrastructure, scaling, and cost optimization
- Real-time constraint systems with LLM reasoning
- MCP-based tool abstractions and integrations
- RAG systems and retrieval optimization
- Production GenAI application design
- Generative AI Engineer roles (LLM infrastructure, agents, RAG)
- AI Systems Engineering roles (agentic workflows, real-time systems)
- Backend Engineer roles at AI-focused companies
Interested in teams working on hard problems at the intersection of LLMs, systems design, and real-time constraints.

