🎵 SoniQue

AI-Powered Music Assistant That Understands You

Search · Play · Feel · Discover — all through natural conversation

49 tools · 12 moods · 1 intelligent agent · ∞ songs

Built with ❤️ by Team StarkMind

Jitin Kumar Sengar · Sonia · Vishakha Gaur · Yashwant Giri

📖 Table of Contents

Overview
Key Features
System Architecture
Project Structure
Tech Stack
Getting Started
API Reference
Tool Reference (49 Tools)
Mood Engine
Agent Configuration
Example Conversations
How It Works
Team
License

🌟 Overview

SoniQue is a conversational AI music assistant that lets you search, play, organise, discover, and feel music through natural language. Unlike traditional music players, SoniQue understands context, detects your mood, fetches lyrics, manages queues, tracks listening habits, and learns your taste — all powered by a single ReAct agent with 49 tools.

No buttons to click. No menus to navigate. Just tell SoniQue what you want.

You:  "I'm feeling sad, play something soothing"
SoniQue: Detects mood → Finds matching music → Plays it → Logs to history

One agent. One LLM call. Forty-nine tools. Zero friction.

✨ Key Features

🎭 Mood-Based Music

Detects your emotion from text using keyword-weighted scoring across 12 mood categories. Automatically selects genre-matched music and plays it.

🎤 Live Lyrics

Fetches plain-text and time-synced karaoke lyrics from lrclib.net and lyrics.ovh. Includes automatic mojibake repair for multilingual support (Hindi, Japanese, etc.).

📝 Smart Queue

Full music queue with shuffle, repeat one/all, play next, reorder, and queue history. Thread-safe with persistent state.

📊 Analytics Dashboard

Beautiful analytics showing top artists, top songs, genre distribution, listening streaks, peak hours, and mood patterns.

🧠 Conversation Memory

50-turn sliding window memory persisted to disk. Remembers context across messages — "play that song again" just works.

🔍 YouTube Search & Play

Natural language search across all of YouTube. Downloads audio, converts to MP3, and plays through a high-quality audio engine.

📋 Playlist Manager

Create unlimited playlists, add/remove songs, browse collections. All stored persistently in JSON.

❤️ Taste Learning

Tracks favourite genres, favourite artists, liked songs, and dislikes. Uses your taste profile to power personalised recommendations.

⚡ Smart Caching

Downloaded songs are cached with a JSON index tracking play counts and last-played timestamps. Replay = instant. Zero re-downloads.

🎵 Now Playing

Live playback status with animated sound bars, volume control, skip/seek, pause/resume — all controllable via natural language.

🏗 System Architecture

┌──────────────────────────────────────────────────────────────────┐
│                        USER (Browser)                            │
│                  Premium SPA · DM Sans + Space Grotesk           │
│         Particles · Ambient Orbs · Frosted Glass UI              │
└────────────────────────────┬─────────────────────────────────────┘
                             │  HTTP (REST API)
                             ▼
┌──────────────────────────────────────────────────────────────────┐
│                     FLASK BACKEND (app.py)                        │
│                                                                    │
│   /api/chat ─────► Lazy-loaded Agent (thread-safe singleton)       │
│   /api/status ───► Now Playing + Queue Summary + Message Count     │
│   /api/queue ────► Full Queue State                                │
│   /api/analytics ► Computed from Listening History                 │
│   /api/playlists ► Playlist Data                                   │
│   /api/preferences ► User Preferences                              │
│   /api/clear ────► Clear Conversation Memory                       │
│                                                                    │
└────────────────────────────┬─────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────────────┐
│              SONIQUE REACT AGENT (agent.py)                       │
│                                                                    │
│   Framework:  jentis ReAct Agent                                   │
│   LLM:        Google Gemini (gemini-3-flash-preview)               │
│   Memory:     ConversationMemory (50-turn sliding window)          │
│   Strategy:   Multi-step tool chaining with mandatory workflows    │
│                                                                    │
│   ┌──────────────────────────────────────────────────────┐        │
│   │                   49 REGISTERED TOOLS                 │        │
│   │                                                       │        │
│   │  ┌──────────┐  ┌──────────┐  ┌──────────┐           │        │
│   │  │ YouTube  │  │ Playback │  │ Playlist │           │        │
│   │  │ Search   │  │ Controls │  │ Manager  │           │        │
│   │  │ (2)      │  │ (9)      │  │ (6)      │           │        │
│   │  └──────────┘  └──────────┘  └──────────┘           │        │
│   │  ┌──────────┐  ┌──────────┐  ┌──────────┐           │        │
│   │  │Preference│  │  Mood    │  │ Lyrics   │           │        │
│   │  │ Tracker  │  │  Engine  │  │ Fetcher  │           │        │
│   │  │ (11)     │  │ (3)      │  │ (2)      │           │        │
│   │  └──────────┘  └──────────┘  └──────────┘           │        │
│   │  ┌──────────┐  ┌──────────┐                          │        │
│   │  │  Queue   │  │ History  │                          │        │
│   │  │ Manager  │  │Analytics │                          │        │
│   │  │ (10)     │  │ (6)      │                          │        │
│   │  └──────────┘  └──────────┘                          │        │
│   └──────────────────────────────────────────────────────┘        │
└────────────────────────────┬─────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────────────┐
│                     EXTERNAL SERVICES                             │
│                                                                    │
│   YouTube (via youtubesearchpython) — No API key needed           │
│   yt-dlp — Audio download from YouTube                            │
│   FFmpeg / imageio-ffmpeg — Audio conversion to MP3               │
│   pygame.mixer — 44100 Hz, 16-bit stereo playback engine          │
│   lrclib.net — Primary lyrics API (plain + synced)                │
│   lyrics.ovh — Fallback lyrics API                                │
│                                                                    │
└──────────────────────────────────────────────────────────────────┘

📁 Project Structure

SoniQue/
│
├── app.py                          # Flask REST API + SPA server
├── agent.py                        # ReAct Agent (49 tools, Gemini LLM)
├── README.md                       # This file
│
├── Tools/                          # All tool modules
│   ├── YouTubeSearchTool.py        # YouTube search + video metadata (2 tools)
│   ├── PlaybackTool.py             # Audio download, play, pause, seek (9 tools)
│   ├── PlaylistTool.py             # CRUD playlist management (6 tools)
│   ├── PreferenceTool.py           # Favourites, likes, dislikes (11 tools)
│   ├── MoodTool.py                 # Mood detection + mood playlists (3 tools)
│   ├── LyricsTool.py              # Plain + synced lyrics fetcher (2 tools)
│   ├── QueueTool.py                # Queue with shuffle/repeat (10 tools)
│   └── HistoryTool.py              # Listening history + analytics (6 tools)
│
├── templates/
│   └── index.html                  # SPA HTML template
│
├── static/
│   ├── css/
│   │   └── style.css               # Complete UI design system
│   └── js/
│       └── app.js                  # SPA interactive logic
│
└── data/                           # Persistent JSON storage
    ├── conversation_memory.json    # Chat history (50-turn window)
    ├── playlists.json              # User playlists
    ├── preferences.json            # Taste profile (genres, artists, likes)
    ├── queue.json                  # Queue state (songs, shuffle, repeat)
    ├── listening_history.json      # Full play history with timestamps
    └── cache/
        └── songs/
            ├── _cache_index.json   # Download cache index
            └── *.mp3               # Cached audio files

🛠 Tech Stack

Layer	Technology	Purpose
Frontend	Vanilla JS SPA	Interactive single-page application
Styling	Custom CSS Design System	Indigo-emerald palette, ambient orbs, particles
Fonts	Space Grotesk + DM Sans	Modern display + body typography
Icons	Lucide Icons	200+ clean SVG icons
Markdown	marked.js	Render agent responses as rich HTML
Backend	Flask 3.x	REST API server
AI Agent	jentis ReAct Framework	Reasoning + Acting agent loop
LLM	Google Gemini (gemini-3-flash-preview)	Language understanding & generation
Search	youtubesearchpython	YouTube video search (no API key)
Download	yt-dlp	Audio extraction from YouTube
Conversion	FFmpeg / imageio-ffmpeg	Audio format conversion to MP3
Playback	pygame.mixer	High-quality local audio playback
Lyrics	lrclib.net + lyrics.ovh	Plain + time-synced lyrics APIs
Storage	JSON files	Lightweight persistent data layer

🚀 Getting Started

Prerequisites

Python 3.12+
Google API Key with Gemini API enabled (Get one here)
FFmpeg (auto-detected; falls back to imageio-ffmpeg if not installed)

1. Clone & Setup

git clone <repo-url> SoniQue
cd SoniQue

# Create virtual environment
python -m venv .venv

# Activate (Windows)
.\.venv\Scripts\Activate.ps1

# Activate (macOS/Linux)
source .venv/bin/activate

2. Install Dependencies

pip install flask jentis pygame-ce yt-dlp imageio-ffmpeg google-generativeai youtubesearchpython

3. Set Environment Variables

# Required — your Google Gemini API key
set GOOGLE_API_KEY=your-google-api-key          # Windows
export GOOGLE_API_KEY=your-google-api-key        # macOS/Linux

# Optional — defaults shown
set GEMINI_MODEL=gemini-3-flash-preview
set PYTHONUTF8=1

4. Run

python app.py

Open http://127.0.0.1:5000 in your browser. That's it!

Alternative: Use as a Python Library

from agent import chat, get_memory, clear_memory

# Chat with SoniQue
response = chat("Play Bohemian Rhapsody")
print(response)

# Check memory
memory = get_memory()
print(f"Messages: {memory.message_count}")

# Clear conversation
clear_memory()

📡 API Reference

All endpoints return JSON. The server runs at http://127.0.0.1:5000.

Endpoint	Method	Description	Request Body	Response
`/`	`GET`	Serves the SPA UI	—	HTML
`/api/chat`	`POST`	Send message to AI agent	`{"message": "..."}`	`{"reply": "..."}`
`/api/status`	`GET`	Playback status + queue summary	—	`{now_playing, queue_count, ...}`
`/api/queue`	`GET`	Full queue state	—	`{queue, current_index, shuffle, repeat_mode}`
`/api/analytics`	`GET`	Listening statistics	—	`{total_plays, top_artists, ...}`
`/api/playlists`	`GET`	All playlists	—	`{playlist_name: [songs]}`
`/api/preferences`	`GET`	User taste profile	—	`{favourite_genres, liked_songs, ...}`
`/api/clear`	`POST`	Clear chat history	—	`{"ok": true}`

Example: Chat

curl -X POST http://127.0.0.1:5000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Play some chill lo-fi music"}'

{
  "reply": "🎵 Now playing: **Lofi Hip Hop Radio** by ChilledCow..."
}

🔧 Tool Reference (49 Tools)

🔍 YouTube Search — 2 tools

Tool	Parameters	Description
`youtube_search`	`query`, `max_results=5`	Searches YouTube using `youtubesearchpython`. Returns video_id, title, URL, duration, views, channel, thumbnail. No API key required.
`youtube_video_details`	`video_url`	Gets detailed metadata — duration_seconds, views, category, description, keywords, is_live, is_family_safe. Accepts URL or bare video ID.

▶️ Playback Controls — 9 tools

Tool	Parameters	Description
`download_and_play_song`	`video_url`, `video_id`, `title`, `channel`	Downloads audio via yt-dlp, converts to MP3, plays through pygame.mixer. Uses song cache to skip re-downloads.
`pause_playback`	—	Pauses the currently playing song
`resume_playback`	—	Resumes a paused song
`stop_playback`	—	Stops playback and clears current song state
`set_volume`	`level` (0-100)	Sets playback volume
`get_now_playing`	—	Returns current song title, channel, status, and volume
`show_cached_songs`	—	Lists all cached songs with play counts
`forward_skip`	`seconds=10`	Skips forward in the current song
`backward_skip`	`seconds=10`	Skips backward in the current song

Implementation notes:

Song cache (SongCache) maintains a JSON index at data/cache/songs/_cache_index.json with download metadata, play counts, and timestamps
FFmpeg resolution: tries shutil.which("ffmpeg") → falls back to imageio_ffmpeg.get_ffmpeg_exe()
Playback engine: pygame.mixer at 44100 Hz, 16-bit, stereo, 4096 buffer
Thread-safe PlaybackState with threading.Lock

📋 Playlist Manager — 6 tools

Tool	Parameters	Description
`create_playlist`	`name`	Creates a new empty playlist
`delete_playlist`	`name`	Permanently deletes a playlist
`list_playlists`	—	Lists all playlists with song counts
`view_playlist`	`name`	Shows all songs in a playlist
`add_song_to_playlist`	`name`, `title`, `artist`, `video_id`, `video_url`	Adds a song to an existing playlist
`remove_song_from_playlist`	`name`, `index`	Removes a song by 0-based index

❤️ Preference Tracker — 11 tools

Tool	Parameters	Description
`add_favourite_genres`	`genres` (comma-separated)	Add genres to favourites
`remove_favourite_genres`	`genres`	Remove genres from favourites
`add_favourite_artists`	`artists` (comma-separated)	Add artists to favourites
`remove_favourite_artists`	`artists`	Remove artists from favourites
`like_song`	`title`, `artist`	Save a song to liked collection
`unlike_song`	`title`	Remove from liked songs
`add_disliked_artists`	`artists`	Mark artists as disliked
`remove_disliked_artists`	`artists`	Un-dislike artists
`add_disliked_genres`	`genres`	Mark genres as disliked
`remove_disliked_genres`	`genres`	Un-dislike genres
`get_preferences`	—	Returns full taste profile

🎭 Mood Engine — 3 tools

Tool	Parameters	Description
`detect_mood`	`text`	Detects user's mood from text using keyword-weighted scoring. Returns mood, confidence, recommended genres, energy level, and search queries.
`get_mood_playlist`	`mood`	Returns curated search queries and genre recommendations for any of the 12 supported moods.
`list_available_moods`	—	Lists all 12 mood categories with emojis, descriptions, genres, and energy levels.

🎤 Lyrics Fetcher — 2 tools

Tool	Parameters	Description
`get_lyrics`	`title`, `artist`	Fetches plain-text lyrics. Tries lrclib.net first, lyrics.ovh as fallback. Truncated to 3000 chars for context. Includes mojibake auto-repair.
`get_synced_lyrics`	`title`, `artist`	Fetches time-synced LRC-format lyrics (karaoke-style) from lrclib.net.

📝 Queue Manager — 10 tools

Tool	Parameters	Description
`add_to_queue`	`title`, `artist`, `video_id`, `video_url`	Adds a song to the end of the queue
`add_to_queue_next`	`title`, `artist`, `video_id`, `video_url`	Inserts a song right after current position
`remove_from_queue`	`index`	Removes a song by 0-based index
`view_queue`	—	Shows all queued songs with current position, shuffle/repeat status
`next_in_queue`	—	Gets next song (respects shuffle/repeat). Returns song info for playback.
`previous_in_queue`	—	Gets previous song in queue
`set_queue_shuffle`	`enabled`	Enable/disable shuffle mode
`set_queue_repeat`	`mode`	Set repeat: `"off"`, `"one"`, or `"all"`
`clear_queue`	—	Clears all songs from queue
`move_in_queue`	`from_index`, `to_index`	Reorder a song's position

📊 History & Analytics — 6 tools

Tool	Parameters	Description
`log_song_play`	`title`, `artist`, `video_id`, `genre`, `mood`	Logs play event with timestamp, hour, day-of-week
`get_listening_history`	`limit=20`	Returns recent plays, newest first
`get_listening_stats`	—	Full analytics: top artists/songs, genre/mood distribution, hourly/daily patterns, streaks (current + longest), total hours
`get_recently_played`	`limit=5`	Most recent unique songs (deduplicated)
`clear_listening_history`	—	Clears all history data
`get_music_taste_summary`	—	AI-readable taste profile summary for powering recommendations

🎭 Mood Engine

SoniQue detects emotion from your text using keyword-weighted scoring and maps it to one of 12 mood categories, each with curated genres and search queries.

Mood	Emoji	Energy	Genres
Happy	😊	High	Pop, Dance, Funk, Disco
Sad	😢	Low	Ballad, Acoustic, Indie Folk
Energetic	⚡	Very High	EDM, Hip-Hop, Rock, Drum & Bass
Relaxed	🧘	Low	Lo-fi, Ambient, Jazz
Romantic	💕	Medium	R&B, Soul, Soft Pop
Angry	😤	Very High	Metal, Punk, Hard Rock
Nostalgic	📼	Medium	80s, 90s, Classic Rock
Focused	🧠	Low-Med	Lo-fi, Classical, Ambient
Party	🎉	Very High	Dance, EDM, Latin
Sleepy	😴	Very Low	Ambient, Sleep, Piano
Motivated	🔥	High	Hip-Hop, Rock, Anthems
Melancholic	🌧️	Low	Post-Rock, Shoegaze, Dream Pop

How mood detection works:

User text is scanned against keyword dictionaries for each mood
Keyword matches are scored by word length (longer = more specific = higher weight)
Confidence = best_score / total_score
The detected mood maps to curated YouTube search queries
Agent automatically searches, plays, and logs with mood metadata

⚙️ Agent Configuration

Parameter	Value
Framework	jentis ReAct (Reasoning + Acting)
LLM	Google Gemini `gemini-3-flash-preview`
Temperature	`0.4` (balanced creativity + accuracy)
Max Tokens	`8,192`
Memory	50-turn sliding window (100 messages)
Tools	49 registered across 8 modules
Strategy	Multi-step tool chaining with mandatory workflows

Mandatory Agent Workflows

The agent follows strict multi-step workflows to ensure complete task execution:

User Request	Agent Workflow
"Play <song>"	`youtube_search` → `download_and_play_song` → `log_song_play`
"I feel sad"	`detect_mood` → `get_mood_playlist` → `youtube_search` → `download_and_play_song` → `log_song_play`
"Show lyrics"	`get_now_playing` → `get_lyrics`
"Recommend songs"	`get_preferences` + `get_music_taste_summary` → personalised suggestions
"Play next"	`next_in_queue` → `download_and_play_song` → `log_song_play`
"My stats"	`get_listening_stats` → formatted analytics

💬 Example Conversations

🎵 Playing Music

You:     "Play Bohemian Rhapsody"
SoniQue: 🎵 Now playing: Bohemian Rhapsody by Queen! Rock on! 🎸

🎭 Mood-Based Playback

You:     "I'm feeling really stressed and need to relax"
SoniQue: I can sense you need some calm vibes 😌 Let me play something relaxing...
         🎵 Now playing: Weightless by Marconi Union — the most relaxing song ever!

🎤 Fetching Lyrics

You:     "Show me the lyrics for this song"
SoniQue: Here are the lyrics for "Bohemian Rhapsody" by Queen:
         
         Is this the real life? Is this just fantasy?
         Caught in a landslide, no escape from reality...

📝 Queue Management

You:     "Queue up some party songs"
SoniQue: 🎉 I've added 3 party bangers to your queue:
         1. Uptown Funk — Bruno Mars
         2. Don't Stop Me Now — Queen
         3. Levitating — Dua Lipa
         Want me to start playing?

📊 Analytics

You:     "What are my listening stats?"
SoniQue: 📊 Here's your music journey:
         • Total Plays: 47 songs across 12 artists
         • Top Artist: Queen (8 plays)
         • Top Genre: Rock 🎸
         • Peak Hour: 6:00 PM
         • Listening Streak: 5 days! 🔥

❤️ Learning Preferences

You:     "I love jazz and lo-fi but hate heavy metal"
SoniQue: Got it! Added jazz and lo-fi to your favourites ❤️
         and heavy metal to your dislikes. I'll keep this in mind
         for recommendations!

⚡ How It Works

Request Flow

1. User types message in the chat UI
2. Frontend sends POST to /api/chat
3. Flask lazy-loads the agent (first request only)
4. Agent receives message + conversation context
5. Gemini LLM reasons about which tools to call
6. Agent executes tools sequentially (multi-step chaining)
7. Agent formats a natural language response
8. Response sent back to frontend as JSON
9. Frontend renders markdown response in chat bubble
10. Status polling updates now-playing and queue sidebar

Key Design Decisions

Decision	Rationale
Single agent, no sub-agents	Faster response times, simpler architecture, one LLM call per request
Lazy agent loading	Heavy imports (pygame, yt-dlp) only loaded on first chat request
Thread-safe singleton	Multiple browser tabs won't create duplicate agents
JSON file storage	Zero-config persistence, easy to debug, no database needed
UTF-8 mojibake repair	Windows' default `cp1252` encoding corrupts emojis — auto-fix in agent.py
Multi-step tool chaining	"Play X" requires search → download → log — agent enforces this
youtubesearchpython	No YouTube API key needed — works out of the box

👥 Team StarkMind

Hack KRMU 5.0

	Member	Role
J	Jitin Kumar Sengar	Team Lead
S	Sonia	Developer
V	Vishakha Gaur	Developer
Y	Yashwant Giri	Developer

📄 License

This project is licensed under the MIT License. See LICENSE for details.

Built with ❤️ by Team StarkMind · Hack KRMU 5.0

jentis · Google Gemini · Flask · pygame · yt-dlp

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
PPT		PPT
Tools		Tools
__pycache__		__pycache__
data		data
static		static
templates		templates
README.md		README.md
agent.py		agent.py
app.py		app.py
app_streamlit_backup.py		app_streamlit_backup.py

Folders and files

Latest commit

History

Repository files navigation

🎵 SoniQue

AI-Powered Music Assistant That Understands You

📖 Table of Contents

🌟 Overview

✨ Key Features

🎭 Mood-Based Music

🎤 Live Lyrics

📝 Smart Queue

📊 Analytics Dashboard

🧠 Conversation Memory

🔍 YouTube Search & Play

📋 Playlist Manager

❤️ Taste Learning

⚡ Smart Caching

🎵 Now Playing

🏗 System Architecture

📁 Project Structure

🛠 Tech Stack

🚀 Getting Started

Prerequisites

1. Clone & Setup

2. Install Dependencies

3. Set Environment Variables

4. Run

Alternative: Use as a Python Library

📡 API Reference

Example: Chat

🔧 Tool Reference (49 Tools)

🔍 YouTube Search — 2 tools

▶️ Playback Controls — 9 tools

📋 Playlist Manager — 6 tools

❤️ Preference Tracker — 11 tools

🎭 Mood Engine — 3 tools

🎤 Lyrics Fetcher — 2 tools

📝 Queue Manager — 10 tools

📊 History & Analytics — 6 tools

🎭 Mood Engine

⚙️ Agent Configuration

Mandatory Agent Workflows

💬 Example Conversations

🎵 Playing Music

🎭 Mood-Based Playback

🎤 Fetching Lyrics

📝 Queue Management

📊 Analytics

❤️ Learning Preferences

⚡ How It Works

Request Flow

Key Design Decisions

👥 Team StarkMind

Hack KRMU 5.0

📄 License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages