Releases: Dhanush-Raj1/FullStack-RAG-Application-Project

Release v1.1.0-preview.1 — Conversational RAG (Feature Preview)

06 Jun 15:35

This release significantly improves the conversational quality and intelligence of the RAG application. The system now understands the intent behind every query before processing it, maintains conversation memory across turns, and resolves vague follow-up questions before retrieval — resulting in a more natural and accurate chat experience.


What's New

1. Intent-Aware Query Routing

  • Every user query is now classified as CONVERSATIONAL or RETRIEVAL before any pipeline processing
  • Conversational queries (greetings, small talk, capability questions) are answered directly by the LLM — no vector search triggered
  • Eliminates the previous behavior where "Hello" or "What can you do?" returned irrelevant document chunks
  • Uses Gemini Flash (gemini-3.1-flash-lite) as a lightweight classifier, preserving Groq token quota for generation
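The routing step described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: `route_query`, `toy_classifier`, and the `chat_fn` / `rag_fn` callables are hypothetical names, and the keyword-based classifier merely stands in for the LLM classifier call.

```python
from typing import Callable

# Labels mirroring the two classes named in the release notes.
CONVERSATIONAL = "CONVERSATIONAL"
RETRIEVAL = "RETRIEVAL"


def route_query(
    query: str,
    classify: Callable[[str], str],
    chat_fn: Callable[[str], str],
    rag_fn: Callable[[str], str],
) -> str:
    """Classify the query first, then dispatch to exactly one pipeline."""
    label = classify(query).strip().upper()
    if label == CONVERSATIONAL:
        # Greetings and small talk go straight to the LLM; no vector search.
        return chat_fn(query)
    # Anything else, including an unexpected label, falls back to retrieval
    # (an assumption of this sketch, not a documented behavior).
    return rag_fn(query)


def toy_classifier(query: str) -> str:
    """Stand-in for the LLM classifier, for demonstration only."""
    greetings = {"hello", "hi", "hey", "what can you do?"}
    return CONVERSATIONAL if query.strip().lower() in greetings else RETRIEVAL
```

Passing the classifier and both pipelines in as callables keeps the sketch testable without network access; in a real deployment `classify` would wrap the lightweight model call.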

2. Conversation Memory

  • Sliding window memory stores the last 10 messages (user + assistant) per session
  • Memory is scoped per session ID — each browser tab maintains isolated history
  • History is injected into both conversational and retrieval responses
  • Enables natural follow-up questions: "remember my name?", "what did I ask earlier?" now work correctly
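A per-session sliding window like the one described above could look something like this. The class name `SessionMemory` and its methods are hypothetical; only the window size of 10 and the per-session scoping come from the notes.

```python
from collections import defaultdict, deque

WINDOW_SIZE = 10  # last 10 messages (user + assistant), per the release notes


class SessionMemory:
    """In-process sliding-window history keyed by session ID (hypothetical)."""

    def __init__(self, window_size: int = WINDOW_SIZE) -> None:
        self._window_size = window_size
        # deque(maxlen=N) discards the oldest item once N is exceeded.
        self._history = defaultdict(lambda: deque(maxlen=self._window_size))

    def add(self, session_id: str, role: str, content: str) -> None:
        """Append one message to the window for this session."""
        self._history[session_id].append({"role": role, "content": content})

    def get(self, session_id: str) -> list:
        """Return a copy of the window; unknown sessions yield an empty list."""
        return list(self._history.get(session_id, ()))
```

Because the store lives in process memory, history is lost on restart, which is consistent with database-backed persistence being listed as a future enhancement.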

3. Query Rewriting & Coreference Resolution

  • Vague follow-up queries ("what does it mean?", "tell me more", "explain that") are automatically detected and rewritten into self-contained search queries using conversation history before retrieval
  • Regex-based pre-check avoids unnecessary LLM calls for non-coreference queries
  • Ensures the retriever always receives a precise, meaningful query
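The regex pre-check and rewrite step might be sketched as below. The pattern, `needs_rewrite`, and `prepare_query` are assumptions made for illustration; the notes do not show the project's actual regex, and `rewrite` stands in for the LLM call that produces a self-contained query.

```python
import re
from typing import Callable, Sequence

# Hypothetical pattern: pronouns and stock follow-up phrases that usually
# depend on earlier turns.
_VAGUE = re.compile(
    r"\b(it|its|this|that|these|those|they|them)\b"
    r"|\b(tell me more|elaborate|go on)\b",
    re.IGNORECASE,
)


def needs_rewrite(query: str) -> bool:
    """Cheap check that runs before any LLM call."""
    return bool(_VAGUE.search(query))


def prepare_query(
    query: str,
    history: Sequence[str],
    rewrite: Callable[[str, Sequence[str]], str],
) -> str:
    """Return the query to send to the retriever."""
    # With no history there is nothing to resolve against, and queries
    # without vague references are passed through unchanged.
    if not history or not needs_rewrite(query):
        return query
    return rewrite(query, history)
```

A pattern this broad will also flag some self-contained queries (for example, ones using "that" as a relative pronoun); in this sketch a false positive costs only one extra rewrite call, while a false negative would send a vague query to the retriever.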

Improvements

  • Added router handling for personal memory questions such as "what is my name?" and "remember my name?": each is classified as CONVERSATIONAL or RETRIEVAL and routed to the LLM or to the retriever for context accordingly
  • generate_answer() system prompt updated to explicitly use conversation history for resolving references in retrieval responses
  • chat() system prompt updated to correctly answer personal/contextual questions ("what is my name?") from history rather than disclaiming memory
  • Query router prompt updated to correctly classify personal memory questions and assistant identity questions as CONVERSATIONAL
  • Both /api/chat/global and /api/chat/session endpoints now fully support memory and query rewriting

Known Future Enhancements

  • Persistent conversation memory across sessions (database-backed)
  • Streaming LLM responses via Server-Sent Events (SSE)
  • Hybrid search (BM25 + Dense Retrieval)
  • HNSW indexing
  • Multi-modal retrieval
  • Citation highlighting in the UI
  • Docker and Kubernetes deployment
  • LangGraph agent workflows
  • Evaluation framework integration (RAGAS)

Live Application

https://rag-frontend-b75n.onrender.com/

Repository

https://github.com/Dhanush-Raj1/FullStack-RAG-Application-Project


Technical Details

  • Tag: v1.1.0-preview.1
  • Branch: main
  • Date: 2026-06-06
  • Release Type: Preview
  • Base Release: v1.0.0-preview.1

Contributors

  • Dhanush Raj

v1.0.0-preview.1 - RAG Application (Initial Preview)

02 Jun 06:50

Overview

This release introduces the first public preview of the Retrieval-Augmented Generation (RAG) application.

The platform enables users to upload and query document collections using natural language. User queries are processed through a retrieval pipeline that identifies relevant context from the knowledge base and generates grounded responses using a Large Language Model (LLM).

Key Features

  • Document ingestion and indexing pipeline
  • Vector search using pgvector
  • Semantic retrieval and reranking
  • LLM-powered answer generation
  • Source-aware responses with retrieved context
  • Modern chat-based user interface
  • Live cloud deployment

What's Included

  • End-to-end document ingestion pipeline
  • Document chunking and preprocessing
  • Embedding generation
  • pgvector-based semantic retrieval
  • Cohere reranking
  • LLM-based answer generation
  • Source retrieval and display
  • React-based frontend interface
  • FastAPI backend services
  • Cloud deployment
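To make the semantic retrieval step concrete, here is a pure-Python sketch of top-k search by cosine distance. It is an illustration only: in the application this ranking is delegated to pgvector inside PostgreSQL (pgvector exposes cosine distance as the `<=>` operator), and the function names and in-memory chunk list here are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[str]:
    """Return the texts of the k chunks closest to the query embedding."""
    ranked = sorted(chunks, key=lambda c: cosine_distance(query_vec, c[1]))
    return [text for text, _ in ranked[:k]]
```

The candidates returned by a step like this would then be passed to the reranker before answer generation, following the order of the components listed above.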

Release Status

This release is provided as a Preview Release.

The core functionality is stable and available for testing; however, additional improvements, optimizations, and user experience enhancements are planned for future releases.

Known Future Enhancements

  • User authentication and chat history persistence
  • Hybrid search (BM25 + Dense Retrieval)
  • HNSW indexing
  • Multi-modal retrieval
  • Citation highlighting in the UI

Live Application

The application is publicly available for testing and evaluation:
https://rag-frontend-b75n.onrender.com/

Repository

https://github.com/Dhanush-Raj1/FullStack-RAG-Application-Project

Technical Details

  • Tag: v1.0.0-preview.1
  • Branch: main
  • Date: 2026-06-02
  • Release Type: Preview

Contributors

  • Dhanush Raj