Real-time voice → speech-to-text → RAG agent → JSON response automation built in n8n, with a separate knowledge-ingestion path into Qdrant. A small web UI records audio, calls your n8n webhook, and displays the assistant reply.
Repository: github.com/coder-msk/Voice-AI-Agent-Using-Qdrant-Vector-Store-Knowledge-Base
Knowledge base (ingestion)
A Form Trigger accepts file uploads. Documents are split, embedded with OpenAI Embeddings, and inserted into a Qdrant collection (for example qdrant_database).
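The splitting step can be sketched as a simple overlapping character-window chunker (a hedged sketch: the workflow uses n8n's built-in text-splitter node, and the chunk_size/overlap values here are illustrative, not taken from the workflow):

```python
def split_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows, roughly as a
    text-splitter node would before the embedding step."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks, start = [], 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks

doc = "word " * 300          # stand-in for an uploaded document
chunks = split_text(doc.strip())
# each chunk would then be embedded with OpenAI Embeddings and
# upserted into the Qdrant collection (e.g. qdrant_database)
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side.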
Voice Q&A (runtime)
A Webhook receives an audio recording, Groq (whisper-large-v3) transcribes it, an AI Agent (OpenAI chat model) answers using a Qdrant Vector Store node in retrieve-as-tool mode, and Respond to Webhook returns { "response": "<agent output>" } to your frontend.
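The final Respond to Webhook step wraps the agent's answer in a single-field JSON object. A minimal sketch of that shaping and of what the frontend parses (the field name response matches the workflow; the answer text is made up):

```python
import json

def shape_response(agent_output: str) -> str:
    # Mirrors the Respond to Webhook node: a single "response" field
    return json.dumps({"response": agent_output})

body = shape_response("Qdrant stores the embedded document chunks.")
payload = json.loads(body)   # what the frontend receives and parses
print(payload["response"])
```

Keeping the payload to one field means the UI only ever reads payload["response"], so the workflow can change internally without breaking the client.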
The bundled workflow JSON is: Voice AI Agent Using Qdrant Vectore Store Knowledge Base.json — import it into n8n and reconnect credentials.
Security: Before pushing or sharing, replace placeholder API values in the workflow with n8n credentials or environment variables. If this project was ever shared with a real Groq or OpenAI key in the JSON, rotate those keys in the provider dashboards.
- n8n (Cloud or self-hosted)
- Qdrant instance and API credentials
- OpenAI API key (chat model + embeddings)
- Groq API key (speech-to-text HTTP node)
- Optional: a static frontend (e.g. served from localhost) that POSTs audio to the webhook path configured in Receive Voice Recording
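What the webhook expects from the frontend can be illustrated by building the multipart body by hand. This is a sketch, not the project's client code: the binary field name audio and the webhook URL are assumptions you must match to your own workflow configuration.

```python
import io
import urllib.request
import uuid

def build_multipart(field: str, filename: str, data: bytes, content_type: str):
    """Build a multipart/form-data body containing one binary file part."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    body.write(f"Content-Type: {content_type}\r\n\r\n".encode())
    body.write(data)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"

# "audio" must match the binary property Convert Speech to Text reads
mp_body, ctype = build_multipart("audio", "recording.webm",
                                 b"\x1aE\xdf\xa3", "audio/webm")
req = urllib.request.Request(
    "https://your-n8n-host/webhook/receive-voice-recording",  # placeholder URL
    data=mp_body,
    headers={"Content-Type": ctype},
    method="POST",
)
# urllib.request.urlopen(req) would send it; left unsent in this sketch
```

A browser frontend gets the same result with FormData plus fetch; the point is that the part name must equal the binary field the transcription node expects.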
Each image below is used once and shows a different part of the system (UI vs. n8n execution panels).
The client shows recording status and the text returned from the automation.
Webhook → Groq transcription → AI Agent (OpenAI) → response to the caller; executions listed in the sidebar.
Output panel for the AI Agent step after a successful execution (example assistant text in the logs).
Send Response to Webhook shapes the payload (e.g. response field) your UI consumes.
Execution detail for the agent leg of the pipeline (duration and token usage in the logs).
- In n8n: Workflows → Import from File and select the JSON workflow.
- Create credentials for Qdrant and OpenAI, and set the Groq Authorization: Bearer … header on Convert Speech to Text to a secure value (prefer n8n credentials or expressions, not a committed secret).
- Activate the workflow; open the Form Trigger URL to ingest documents and the Webhook URL for voice requests.
- Point your frontend at the production webhook URL and ensure the binary field name matches what Convert Speech to Text expects (e.g. audio).
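Instead of committing the Groq key, the Authorization header on Convert Speech to Text can reference an environment variable via an n8n expression. A sketch, assuming you export the key to the n8n process as GROQ_API_KEY (an assumed name) and that environment access is not blocked by N8N_BLOCK_ENV_ACCESS_IN_NODE:

```
Authorization: Bearer {{ $env.GROQ_API_KEY }}
```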
Add a LICENSE file in this repository if you want to specify terms for reuse.




