
Voice AI Agent with Qdrant Vector Store (n8n)

Real-time voice → speech-to-text → RAG agent → JSON response automation built in n8n, with a separate knowledge-ingestion path into Qdrant. A small web UI records audio, calls your n8n webhook, and displays the assistant reply.

Repository: github.com/coder-msk/Voice-AI-Agent-Using-Qdrant-Vector-Store-Knowledge-Base

What this workflow does

  1. Knowledge base (ingestion)
    A Form Trigger accepts file uploads. Documents are split, embedded with OpenAI Embeddings, and inserted into a Qdrant collection (for example qdrant_database).

  2. Voice Q&A (runtime)
    A Webhook receives an audio recording, Groq (whisper-large-v3) transcribes it, an AI Agent (OpenAI chat model) answers using a Qdrant Vector Store node in retrieve-as-tool mode, and Respond to Webhook returns { "response": "<agent output>" } to your frontend.
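The contract between the workflow and the frontend is small: the UI POSTs audio, and Respond to Webhook returns a one-field JSON object. A minimal Python sketch of parsing that reply on the client side (the assistant text here is invented for illustration):

```python
import json

def extract_reply(body: str) -> str:
    """Pull the assistant text out of the {"response": ...} payload
    that the Respond to Webhook node returns."""
    payload = json.loads(body)
    return payload["response"]

# Hypothetical raw reply body from the webhook.
raw_reply = '{"response": "Our office opens at 9 AM on weekdays."}'
print(extract_reply(raw_reply))
```

Any frontend (or the bundled web UI) only needs to read the response key; everything else about the answer is produced inside the n8n workflow.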

The bundled workflow JSON is: Voice AI Agent Using Qdrant Vectore Store Knowledge Base.json — import it into n8n and reconnect credentials.

Security: Before pushing or sharing, replace placeholder API values in the workflow with n8n credentials or environment variables. If this project was ever shared with a real Groq or OpenAI key in the JSON, rotate those keys in the provider dashboards.

Requirements

  • n8n (Cloud or self-hosted)
  • Qdrant instance and API credentials
  • OpenAI API key (chat model + embeddings)
  • Groq API key (speech-to-text HTTP node)
  • Optional: a static frontend (e.g. served from localhost) that POSTs audio to the webhook path configured in Receive Voice Recording

Screenshots

Each screenshot below shows a different part of the system: the web UI on one side, and n8n execution panels on the other.

Voice web UI — recording complete and on-screen reply

The client shows recording status and the text returned from the automation.


n8n — voice pipeline on the canvas with execution history

Webhook → Groq transcription → AI Agent (OpenAI) → response to the caller; executions listed in the sidebar.


n8n — AI Agent node output (example run)

Output panel for the AI Agent step after a successful execution (example assistant text in the logs).


n8n — final webhook JSON returned to the frontend

Send Response to Webhook shapes the payload (e.g. response field) your UI consumes.


n8n — voice pipeline with agent step timing and tokens

Execution detail for the agent leg of the pipeline (duration and token usage in the logs).


Import and configuration (short checklist)

  1. In n8n: Workflows → Import from File and select the JSON workflow.
  2. Create n8n credentials for Qdrant and OpenAI, then set the Groq Authorization: Bearer … header on Convert Speech to Text to a secure value (prefer n8n credentials or expressions over a secret committed in the JSON).
  3. Activate the workflow; open the Form Trigger URL to ingest documents and the Webhook URL for voice requests.
  4. Point your frontend at the production webhook URL and ensure the binary field name matches what Convert Speech to Text expects (e.g. audio).
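To check step 4 outside the browser, you can assemble the same multipart POST the web UI sends. A stdlib-only Python sketch, assuming the binary field is named audio and a placeholder local webhook URL (building the request does not touch the network; sending it would be urllib.request.urlopen against your real production URL):

```python
import io
import urllib.request
import uuid

def build_audio_request(webhook_url: str, audio_bytes: bytes,
                        field_name: str = "audio",
                        filename: str = "recording.webm") -> urllib.request.Request:
    """Assemble a multipart/form-data POST carrying one audio file.

    field_name must match the binary property that the
    Convert Speech to Text node reads ("audio" is an assumption here).
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: audio/webm\r\n\r\n")
    body.write(audio_bytes)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return urllib.request.Request(
        webhook_url,
        data=body.getvalue(),
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )

# Placeholder URL and dummy bytes; substitute your own webhook path and file.
req = build_audio_request("http://localhost:5678/webhook/voice", b"\x1aEdummy")
```

If transcription returns empty text, the usual culprit is a mismatch between this field name and the binary property configured on the Convert Speech to Text node.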

License

Add a LICENSE file in this repository if you want to specify terms for reuse.

About

Voice AI agent built with n8n that converts speech to text, processes queries using an LLM, and retrieves context from a Qdrant vector database for accurate, knowledge-grounded responses. Enables real-time voice-based RAG systems and intelligent automation workflows.
