Transform YouTube videos into searchable knowledge with AI-powered transcription and semantic search
InsightFlow AI is an intelligent video processing system that extracts audio from YouTube videos, transcribes content using OpenAI's Whisper, and creates a searchable question-answering system using LangChain's RAG (Retrieval-Augmented Generation) capabilities.
- AI-Powered Transcription: Automatically transcribe YouTube videos using OpenAI's Whisper model
- Semantic Search: Query video content using natural language questions
- Interactive Q&A: Ask specific questions and get relevant answers from the video content
- Full Transcript Access: View and download complete transcriptions
- Vector Database: Leverages ChromaDB for efficient semantic search
- Modern UI: Clean, responsive interface built with Streamlit
- Frontend: Streamlit
- Transcription: OpenAI Whisper
- Vector Database: ChromaDB
- Embeddings: Sentence Transformers (all-MiniLM-L6-v2)
- LLM Framework: LangChain
- Video Processing: yt-dlp, FFmpeg
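Before a transcript can be embedded and stored, it is usually split into overlapping chunks so that a sentence cut at a boundary still appears whole in a neighbouring chunk. The snippet below is a minimal plain-Python sketch of that idea, not the LangChain splitter this project depends on; `chunk_text` and its default sizes are hypothetical names and values chosen for illustration.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into character windows that overlap by `overlap` characters.

    Hypothetical helper for illustration; the sizes are assumptions,
    not values taken from this project.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(piece)
```

A larger overlap keeps more context across boundaries at the cost of storing more duplicated text.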
Before you begin, ensure you have the following installed:
- Python 3.8 or higher
- FFmpeg (available from ffmpeg.org)
- pip (Python package manager)
git clone https://github.com/yourusername/insightflow-ai.git
cd insightflow-ai
# Windows
python -m venv venv
venv\Scripts\activate
# macOS/Linux
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Windows:
- Download FFmpeg from ffmpeg.org
- Extract to a directory (e.g., D:\ffmpeg\bin)
- Add to System PATH or update the path in processor.py
macOS:
brew install ffmpeg
Linux:
sudo apt update
sudo apt install ffmpeg
streamlit run app.py
The application will open in your default browser at http://localhost:8501
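If audio extraction fails after launch, a missing FFmpeg is a likely cause. The following is a small sketch, assuming only the Python standard library, that checks whether an `ffmpeg` executable is reachable on PATH; `find_ffmpeg` is a hypothetical helper and is not part of this project.

```python
import shutil
from typing import Optional


def find_ffmpeg(executable: str = "ffmpeg") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path:
        print(f"FFmpeg found at {path}")
    else:
        print("FFmpeg not found on PATH; install it or update the path in processor.py")
```

Running this inside the activated virtual environment checks the same PATH that the Streamlit process will see.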
- Paste YouTube URL: Enter the URL of the YouTube video you want to analyze
- Click "Analyze Video": Wait for the processing to complete (1-3 minutes depending on video length)
- Ask Questions: Once processing is complete, ask questions about the video content
- View Transcript: Expand the transcript section to see the full text
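A pasted URL can be sanity-checked before any download starts. The sketch below is an assumption about how such a check could look, not a description of what app.py does; `extract_video_id` is a hypothetical helper that handles only the common `watch?v=` and `youtu.be/` forms and returns None for anything else.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hostnames treated as YouTube in this sketch (an illustrative, incomplete set).
YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a common YouTube URL form, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None

    if host in YOUTUBE_HOSTS and parsed.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        values = parse_qs(parsed.query).get("v", [])
        return values[0] if values else None

    return None


if __name__ == "__main__":
    print(extract_video_id("https://www.youtube.com/watch?v=abc123XYZ_-"))
```

Other URL shapes (shorts, embeds, playlists) are deliberately left out here; yt-dlp itself accepts a much wider range of inputs.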
- "What is the main topic of this video?"
- "Can you summarize the key points discussed?"
- "What does the speaker say about [specific topic]?"
- "What are the recommendations mentioned?"
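To give a feel for how a question is matched against transcript chunks, here is a toy retrieval sketch. It is an assumption-laden stand-in: word-count vectors replace the Sentence Transformers embeddings, and a plain list replaces ChromaDB, so it only matches shared words rather than meaning. The function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(question: str, chunks: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks most similar to the question, best first."""
    query = tokenize(question)
    scored = [(cosine(query, tokenize(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    transcript_chunks = [
        "The speaker recommends daily practice.",
        "The weather was sunny during the recording.",
        "Practice with real projects is the main recommendation.",
    ]
    for score, chunk in top_chunks("What practice does the speaker recommend?", transcript_chunks):
        print(f"{score:.2f}  {chunk}")
```

A real embedding model would also score paraphrases highly (for example "advice" against "recommendation"), which this word-overlap version cannot do.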
insightflow-ai/
│
├── app.py                  # Main Streamlit application
├── processor.py            # Video download and transcription logic
├── brain.py                # Vector database and RAG implementation
├── requirements.txt        # Python dependencies
├── README.md               # Project documentation
│
├── .vscode/
│   └── launch.json         # VS Code debug configuration
│
├── venv/                   # Virtual environment (not tracked)
├── chroma_db/              # Vector database storage (generated)
└── temp_audio.mp3          # Temporary audio files (generated)
If FFmpeg is not in your system PATH, update the path in processor.py:
os.environ["PATH"] += os.pathsep + r"YOUR_FFMPEG_PATH\bin"

You can change the Whisper model for different accuracy/speed tradeoffs in processor.py:
# Options: tiny, base, small, medium, large
model = whisper.load_model("base") # Change "base" to your preferred model

| Model | Speed | Accuracy | Use Case |
|---|---|---|---|
| tiny | Fastest | ⭐⭐ | Quick testing |
| base | Fast | ⭐⭐⭐ | Default, balanced |
| small | Moderate | ⭐⭐⭐⭐ | Better accuracy |
| medium | Slow | ⭐⭐⭐⭐⭐ | High accuracy |
| large | Slowest | ⭐⭐⭐⭐⭐ | Best accuracy |
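Rather than editing the model name by hand each time, the choice could be validated in one place. This is a hedged sketch: `resolve_model_name` and the `INSIGHTFLOW_WHISPER_MODEL` environment variable are hypothetical and do not exist in this project; only the five model names come from the options listed above.

```python
import os
from typing import Optional

# Model names taken from the options listed for processor.py.
WHISPER_MODELS = ("tiny", "base", "small", "medium", "large")


def resolve_model_name(requested: Optional[str] = None, default: str = "base") -> str:
    """Normalise a requested model name and reject anything not in WHISPER_MODELS."""
    name = (requested or default).strip().lower()
    if name not in WHISPER_MODELS:
        raise ValueError(
            f"Unknown Whisper model {name!r}; expected one of {', '.join(WHISPER_MODELS)}"
        )
    return name


if __name__ == "__main__":
    # INSIGHTFLOW_WHISPER_MODEL is an assumed variable name, used here for illustration.
    chosen = resolve_model_name(os.environ.get("INSIGHTFLOW_WHISPER_MODEL"))
    print(f"Would load Whisper model: {chosen}")
    # In processor.py this could then become:
    #     model = whisper.load_model(chosen)
```

Failing early on a misspelled name avoids discovering the mistake only after the audio has been downloaded.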
- Open app.py in VS Code
- Press F5 or click "Run and Debug"
- Select the "Python: Streamlit" configuration
Test Processor:
python processor.py
Test Brain (Vector DB):
python brain.py
yt-dlp # YouTube video downloader
openai-whisper # Audio transcription
langchain-text-splitters # Text chunking
langchain-community # LangChain integrations
langchain-core # LangChain core functionality
chromadb # Vector database
sentence-transformers # Text embeddings
torch # PyTorch for ML models
streamlit # Web interface
- Educational Content: Extract key information from lectures and tutorials
- Podcast Analysis: Search through podcast episodes for specific topics
- Video Research: Quickly find relevant sections in long-form content
- Meeting Recordings: Create searchable transcripts of recorded meetings
- Content Creation: Analyze competitor videos or research topics
- Support for multiple video sources (Vimeo, local files)
- Multi-language support
- Export functionality (PDF, DOCX)
- Timestamp-based search results
- Video player integration with auto-jump to relevant sections
- Batch processing for multiple videos
- Advanced analytics dashboard
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI Whisper for the incredible speech recognition model
- LangChain for the RAG framework
- Streamlit for the easy-to-use web framework
- ChromaDB for the vector database
Minhajul Islam Nion
- Email: minhajulislamnion@gmail.com
- University: University of Canberra
- LinkedIn: Nion007
- GitHub: @Nion9
For questions or feedback, please reach out via email or open an issue on GitHub.
