🎡 SoniQue

AI-Powered Music Assistant That Understands You

Search · Play · Feel · Discover, all through natural conversation

Python Flask Gemini Hack KRMU 5.0


49 tools · 12 moods · 1 intelligent agent · ∞ songs


Built with ❤️ by Team StarkMind

Jitin Kumar Sengar · Sonia · Vishakha Gaur · Yashwant Giri



🌟 Overview

SoniQue is a conversational AI music assistant that lets you search, play, organise, discover, and feel music through natural language. Unlike traditional music players, SoniQue understands context, detects your mood, fetches lyrics, manages queues, tracks listening habits, and learns your taste, all powered by a single ReAct agent with 49 tools.

No buttons to click. No menus to navigate. Just tell SoniQue what you want.

You:  "I'm feeling sad, play something soothing"
SoniQue: Detects mood → Finds matching music → Plays it → Logs to history

One agent. One LLM call. Forty-nine tools. Zero friction.


✨ Key Features

🎭 Mood-Based Music

Detects your emotion from text using keyword-weighted scoring across 12 mood categories. Automatically selects genre-matched music and plays it.

🎀 Live Lyrics

Fetches plain-text and time-synced karaoke lyrics from lrclib.net and lyrics.ovh. Includes automatic mojibake repair for multilingual support (Hindi, Japanese, etc.).

📝 Smart Queue

Full music queue with shuffle, repeat one/all, play next, reorder, and queue history. Thread-safe with persistent state.

📊 Analytics Dashboard

Analytics covering top artists, top songs, genre distribution, listening streaks, peak hours, and mood patterns.

🧠 Conversation Memory

50-turn sliding window memory persisted to disk. Remembers context across messages, so "play that song again" just works.

🔍 YouTube Search & Play

Natural language search across all of YouTube. Downloads audio, converts to MP3, and plays through a high-quality audio engine.

📋 Playlist Manager

Create unlimited playlists, add/remove songs, browse collections. All stored persistently in JSON.

❤️ Taste Learning

Tracks favourite genres, favourite artists, liked songs, and dislikes. Uses your taste profile to power personalised recommendations.

⚡ Smart Caching

Downloaded songs are cached with a JSON index tracking play counts and last-played timestamps. Replay = instant. Zero re-downloads.

🎡 Now Playing

Live playback status with animated sound bars, volume control, skip/seek, pause/resume β€” all controllable via natural language.


🏗 System Architecture

┌────────────────────────────────────────────────────────────────────┐
│                           USER (Browser)                           │
│               Premium SPA · DM Sans + Space Grotesk                │
│            Particles · Ambient Orbs · Frosted Glass UI             │
└──────────────────────────────────┬─────────────────────────────────┘
                                   │  HTTP (REST API)
                                   ▼
┌────────────────────────────────────────────────────────────────────┐
│                       FLASK BACKEND (app.py)                       │
│                                                                    │
│   /api/chat ────────► Lazy-loaded Agent (thread-safe singleton)    │
│   /api/status ──────► Now Playing + Queue Summary + Message Count  │
│   /api/queue ───────► Full Queue State                             │
│   /api/analytics ───► Computed from Listening History              │
│   /api/playlists ───► Playlist Data                                │
│   /api/preferences ─► User Preferences                             │
│   /api/clear ───────► Clear Conversation Memory                    │
│                                                                    │
└──────────────────────────────────┬─────────────────────────────────┘
                                   │
                                   ▼
┌────────────────────────────────────────────────────────────────────┐
│                   SONIQUE REACT AGENT (agent.py)                   │
│                                                                    │
│   Framework:  jentis ReAct Agent                                   │
│   LLM:        Google Gemini (gemini-3-flash-preview)               │
│   Memory:     ConversationMemory (50-turn sliding window)          │
│   Strategy:   Multi-step tool chaining with mandatory workflows    │
│                                                                    │
│   ┌──────────────────────────────────────────────────────┐         │
│   │                 49 REGISTERED TOOLS                  │         │
│   │                                                      │         │
│   │  ┌──────────┐  ┌──────────┐  ┌──────────┐            │         │
│   │  │ YouTube  │  │ Playback │  │ Playlist │            │         │
│   │  │ Search   │  │ Controls │  │ Manager  │            │         │
│   │  │ (2)      │  │ (9)      │  │ (6)      │            │         │
│   │  └──────────┘  └──────────┘  └──────────┘            │         │
│   │  ┌──────────┐  ┌──────────┐  ┌──────────┐            │         │
│   │  │Preference│  │ Mood     │  │ Lyrics   │            │         │
│   │  │ Tracker  │  │ Engine   │  │ Fetcher  │            │         │
│   │  │ (11)     │  │ (3)      │  │ (2)      │            │         │
│   │  └──────────┘  └──────────┘  └──────────┘            │         │
│   │  ┌──────────┐  ┌──────────┐                          │         │
│   │  │ Queue    │  │ History  │                          │         │
│   │  │ Manager  │  │Analytics │                          │         │
│   │  │ (10)     │  │ (6)      │                          │         │
│   │  └──────────┘  └──────────┘                          │         │
│   └──────────────────────────────────────────────────────┘         │
└──────────────────────────────────┬─────────────────────────────────┘
                                   │
                                   ▼
┌────────────────────────────────────────────────────────────────────┐
│                         EXTERNAL SERVICES                          │
│                                                                    │
│   YouTube (via youtubesearchpython) - No API key needed            │
│   yt-dlp - Audio download from YouTube                             │
│   FFmpeg / imageio-ffmpeg - Audio conversion to MP3                │
│   pygame.mixer - 44100 Hz, 16-bit stereo playback engine           │
│   lrclib.net - Primary lyrics API (plain + synced)                 │
│   lyrics.ovh - Fallback lyrics API                                 │
│                                                                    │
└────────────────────────────────────────────────────────────────────┘

📁 Project Structure

SoniQue/
│
├── app.py                          # Flask REST API + SPA server
├── agent.py                        # ReAct Agent (49 tools, Gemini LLM)
├── README.md                       # This file
│
├── Tools/                          # All tool modules
│   ├── YouTubeSearchTool.py        # YouTube search + video metadata (2 tools)
│   ├── PlaybackTool.py             # Audio download, play, pause, seek (9 tools)
│   ├── PlaylistTool.py             # CRUD playlist management (6 tools)
│   ├── PreferenceTool.py           # Favourites, likes, dislikes (11 tools)
│   ├── MoodTool.py                 # Mood detection + mood playlists (3 tools)
│   ├── LyricsTool.py               # Plain + synced lyrics fetcher (2 tools)
│   ├── QueueTool.py                # Queue with shuffle/repeat (10 tools)
│   └── HistoryTool.py              # Listening history + analytics (6 tools)
│
├── templates/
│   └── index.html                  # SPA HTML template
│
├── static/
│   ├── css/
│   │   └── style.css               # Complete UI design system
│   └── js/
│       └── app.js                  # SPA interactive logic
│
└── data/                           # Persistent JSON storage
    ├── conversation_memory.json    # Chat history (50-turn window)
    ├── playlists.json              # User playlists
    ├── preferences.json            # Taste profile (genres, artists, likes)
    ├── queue.json                  # Queue state (songs, shuffle, repeat)
    ├── listening_history.json      # Full play history with timestamps
    └── cache/
        └── songs/
            ├── _cache_index.json   # Download cache index
            └── *.mp3               # Cached audio files

🛠 Tech Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| Frontend | Vanilla JS SPA | Interactive single-page application |
| Styling | Custom CSS Design System | Indigo-emerald palette, ambient orbs, particles |
| Fonts | Space Grotesk + DM Sans | Modern display + body typography |
| Icons | Lucide Icons | 200+ clean SVG icons |
| Markdown | marked.js | Render agent responses as rich HTML |
| Backend | Flask 3.x | REST API server |
| AI Agent | jentis ReAct Framework | Reasoning + Acting agent loop |
| LLM | Google Gemini (gemini-3-flash-preview) | Language understanding & generation |
| Search | youtubesearchpython | YouTube video search (no API key) |
| Download | yt-dlp | Audio extraction from YouTube |
| Conversion | FFmpeg / imageio-ffmpeg | Audio format conversion to MP3 |
| Playback | pygame.mixer | High-quality local audio playback |
| Lyrics | lrclib.net + lyrics.ovh | Plain + time-synced lyrics APIs |
| Storage | JSON files | Lightweight persistent data layer |

🚀 Getting Started

Prerequisites

  • Python 3.12+
  • Google API Key with Gemini API enabled (Get one here)
  • FFmpeg (auto-detected; falls back to imageio-ffmpeg if not installed)
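The FFmpeg auto-detection mentioned in the last prerequisite can be sketched as follows. This is a minimal illustration, not the project's code: the function name `resolve_ffmpeg` and the choice to return `None` when nothing is found are assumptions.

```python
import shutil


def resolve_ffmpeg():
    """Return a path to an FFmpeg executable, or None if none is available."""
    # 1. Prefer a system-wide FFmpeg found on PATH.
    path = shutil.which("ffmpeg")
    if path:
        return path
    # 2. Otherwise fall back to the binary bundled with imageio-ffmpeg.
    try:
        import imageio_ffmpeg
        return imageio_ffmpeg.get_ffmpeg_exe()
    except (ImportError, RuntimeError):
        # Package not installed, or it could not locate a binary.
        return None


ffmpeg_path = resolve_ffmpeg()
print(ffmpeg_path or "FFmpeg not found; install FFmpeg or imageio-ffmpeg")
```

Because step 2 of the install instructions already pulls in imageio-ffmpeg, the fallback should normally succeed even on a machine without a system FFmpeg.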

1. Clone & Setup

git clone <repo-url> SoniQue
cd SoniQue

# Create virtual environment
python -m venv .venv

# Activate (Windows)
.\.venv\Scripts\Activate.ps1

# Activate (macOS/Linux)
source .venv/bin/activate

2. Install Dependencies

pip install flask jentis pygame-ce yt-dlp imageio-ffmpeg google-generativeai youtube-search-python

3. Set Environment Variables

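# Required: your Google Gemini API key
$env:GOOGLE_API_KEY = "your-google-api-key"     # Windows (PowerShell)
export GOOGLE_API_KEY=your-google-api-key        # macOS/Linux

# Optional: defaults shown
$env:GEMINI_MODEL = "gemini-3-flash-preview"    # Windows (PowerShell)
$env:PYTHONUTF8 = "1"                           # Windows (PowerShell)
export GEMINI_MODEL=gemini-3-flash-preview       # macOS/Linux
export PYTHONUTF8=1                              # macOS/Linux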

4. Run

python app.py

Open http://127.0.0.1:5000 in your browser. That's it!

Alternative: Use as a Python Library

from agent import chat, get_memory, clear_memory

# Chat with SoniQue
response = chat("Play Bohemian Rhapsody")
print(response)

# Check memory
memory = get_memory()
print(f"Messages: {memory.message_count}")

# Clear conversation
clear_memory()

📡 API Reference

All endpoints return JSON. The server runs at http://127.0.0.1:5000.

| Endpoint | Method | Description | Request Body | Response |
| --- | --- | --- | --- | --- |
| / | GET | Serves the SPA UI | - | HTML |
| /api/chat | POST | Send message to AI agent | {"message": "..."} | {"reply": "..."} |
| /api/status | GET | Playback status + queue summary | - | {now_playing, queue_count, ...} |
| /api/queue | GET | Full queue state | - | {queue, current_index, shuffle, repeat_mode} |
| /api/analytics | GET | Listening statistics | - | {total_plays, top_artists, ...} |
| /api/playlists | GET | All playlists | - | {playlist_name: [songs]} |
| /api/preferences | GET | User taste profile | - | {favourite_genres, liked_songs, ...} |
| /api/clear | POST | Clear chat history | - | {"ok": true} |

Example: Chat

curl -X POST http://127.0.0.1:5000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "Play some chill lo-fi music"}'
{
  "reply": "🎵 Now playing: **Lofi Hip Hop Radio** by ChilledCow..."
}
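
The same call can be made from Python with only the standard library. This is a sketch: the helper names `build_chat_request` and `send_chat` and the 120-second timeout are assumptions, while the URL, request body, and `reply` field come from the table above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5000"


def build_chat_request(message: str) -> urllib.request.Request:
    """Build the POST request for /api/chat without sending it."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(message: str, timeout: float = 120.0) -> str:
    """Send a message to a running SoniQue server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(message), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["reply"]
```

With the server running, `print(send_chat("Play some chill lo-fi music"))` prints the agent's reply. The generous timeout allows for the download step on a song that is not yet cached.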

🔧 Tool Reference (49 Tools)

🔍 YouTube Search (2 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| youtube_search | query, max_results=5 | Searches YouTube using youtubesearchpython. Returns video_id, title, URL, duration, views, channel, thumbnail. No API key required. |
| youtube_video_details | video_url | Gets detailed metadata: duration_seconds, views, category, description, keywords, is_live, is_family_safe. Accepts URL or bare video ID. |

▶️ Playback Controls (9 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| download_and_play_song | video_url, video_id, title, channel | Downloads audio via yt-dlp, converts to MP3, plays through pygame.mixer. Uses song cache to skip re-downloads. |
| pause_playback | - | Pauses the currently playing song |
| resume_playback | - | Resumes a paused song |
| stop_playback | - | Stops playback and clears current song state |
| set_volume | level (0-100) | Sets playback volume |
| get_now_playing | - | Returns current song title, channel, status, and volume |
| show_cached_songs | - | Lists all cached songs with play counts |
| forward_skip | seconds=10 | Skips forward in the current song |
| backward_skip | seconds=10 | Skips backward in the current song |

Implementation notes:

  • Song cache (SongCache) maintains a JSON index at data/cache/songs/_cache_index.json with download metadata, play counts, and timestamps
  • FFmpeg resolution: tries shutil.which("ffmpeg") → falls back to imageio_ffmpeg.get_ffmpeg_exe()
  • Playback engine: pygame.mixer at 44100 Hz, 16-bit, stereo, 4096 buffer
  • Thread-safe PlaybackState with threading.Lock
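
To make the cache and locking notes concrete, here is a minimal sketch of a lock-guarded JSON cache index. It is not the project's SongCache: the class name `SongCacheSketch`, the entry fields (`title`, `file`, `play_count`, `last_played`), and the `<video_id>.mp3` naming are assumptions chosen for illustration.

```python
import json
import tempfile
import threading
import time
from pathlib import Path


class SongCacheSketch:
    """Hypothetical cache index; field names are assumptions, not SongCache's schema."""

    def __init__(self, cache_dir: Path):
        self.cache_dir = cache_dir
        self.index_path = cache_dir / "_cache_index.json"
        self._lock = threading.Lock()  # guards the index against concurrent requests
        cache_dir.mkdir(parents=True, exist_ok=True)
        if self.index_path.exists():
            self.index = json.loads(self.index_path.read_text(encoding="utf-8"))
        else:
            self.index = {}

    def lookup(self, video_id: str):
        """Return the cached MP3 path and bump its stats, or None on a cache miss."""
        with self._lock:
            entry = self.index.get(video_id)
            if entry is None:
                return None
            entry["play_count"] += 1
            entry["last_played"] = time.time()
            self._save()
            return self.cache_dir / entry["file"]

    def register(self, video_id: str, title: str) -> Path:
        """Record a freshly downloaded song and return where its MP3 should live."""
        with self._lock:
            self.index[video_id] = {
                "title": title,
                "file": f"{video_id}.mp3",
                "play_count": 1,
                "last_played": time.time(),
            }
            self._save()
            return self.cache_dir / f"{video_id}.mp3"

    def _save(self) -> None:
        # Called only while the lock is held.
        self.index_path.write_text(json.dumps(self.index, indent=2), encoding="utf-8")


# Demo in a temporary directory so the real data/cache/songs folder is untouched.
cache = SongCacheSketch(Path(tempfile.mkdtemp()) / "songs")
```

A caller would try `cache.lookup(video_id)` first and only download and `register` the song on a miss, which is what makes a replay skip the download step.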

📋 Playlist Manager (6 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| create_playlist | name | Creates a new empty playlist |
| delete_playlist | name | Permanently deletes a playlist |
| list_playlists | - | Lists all playlists with song counts |
| view_playlist | name | Shows all songs in a playlist |
| add_song_to_playlist | name, title, artist, video_id, video_url | Adds a song to an existing playlist |
| remove_song_from_playlist | name, index | Removes a song by 0-based index |

❤️ Preference Tracker (11 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| add_favourite_genres | genres (comma-separated) | Add genres to favourites |
| remove_favourite_genres | genres | Remove genres from favourites |
| add_favourite_artists | artists (comma-separated) | Add artists to favourites |
| remove_favourite_artists | artists | Remove artists from favourites |
| like_song | title, artist | Save a song to liked collection |
| unlike_song | title | Remove from liked songs |
| add_disliked_artists | artists | Mark artists as disliked |
| remove_disliked_artists | artists | Un-dislike artists |
| add_disliked_genres | genres | Mark genres as disliked |
| remove_disliked_genres | genres | Un-dislike genres |
| get_preferences | - | Returns full taste profile |

🎭 Mood Engine (3 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| detect_mood | text | Detects user's mood from text using keyword-weighted scoring. Returns mood, confidence, recommended genres, energy level, and search queries. |
| get_mood_playlist | mood | Returns curated search queries and genre recommendations for any of the 12 supported moods. |
| list_available_moods | - | Lists all 12 mood categories with emojis, descriptions, genres, and energy levels. |

🎀 Lyrics Fetcher β€” 2 tools

Tool Parameters Description
get_lyrics title, artist Fetches plain-text lyrics. Tries lrclib.net first, lyrics.ovh as fallback. Truncated to 3000 chars for context. Includes mojibake auto-repair.
get_synced_lyrics title, artist Fetches time-synced LRC-format lyrics (karaoke-style) from lrclib.net.
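
One common way to repair mojibake is to reverse the wrong decoding step. The sketch below assumes the damage came from UTF-8 bytes being read as cp1252; the function name `repair_mojibake` is hypothetical and the project's actual repair logic may differ.

```python
def repair_mojibake(text: str) -> str:
    """Undo UTF-8 text that was mistakenly decoded as cp1252.

    If the round trip fails, the text was not damaged in this way,
    so it is returned unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Simulate the damage: UTF-8 bytes for "café" read back as cp1252.
broken = "café".encode("utf-8").decode("cp1252")
fixed = repair_mojibake(broken)
```

Text that is already correct, including Hindi or Japanese lyrics, cannot be encoded as cp1252 or does not decode as UTF-8 afterwards, so it passes through untouched. Damage involving bytes that cp1252 leaves undefined cannot be reversed this way.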

📝 Queue Manager (10 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| add_to_queue | title, artist, video_id, video_url | Adds a song to the end of the queue |
| add_to_queue_next | title, artist, video_id, video_url | Inserts a song right after current position |
| remove_from_queue | index | Removes a song by 0-based index |
| view_queue | - | Shows all queued songs with current position, shuffle/repeat status |
| next_in_queue | - | Gets next song (respects shuffle/repeat). Returns song info for playback. |
| previous_in_queue | - | Gets previous song in queue |
| set_queue_shuffle | enabled | Enable/disable shuffle mode |
| set_queue_repeat | mode | Set repeat: "off", "one", or "all" |
| clear_queue | - | Clears all songs from queue |
| move_in_queue | from_index, to_index | Reorder a song's position |
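
How next_in_queue might combine shuffle and repeat can be sketched as a pure function. The function name `next_index` and the exact precedence (repeat-one first, then shuffle, then sequential advance) are assumptions. QueueTool.py may order these rules differently.

```python
import random
from typing import Optional


def next_index(current: int, length: int, shuffle: bool, repeat: str) -> Optional[int]:
    """Pick the index of the next song, or None when playback should stop.

    repeat is "off", "one", or "all", matching set_queue_repeat.
    """
    if length == 0:
        return None
    if repeat == "one":
        return current  # replay the same song
    if shuffle and length > 1:
        # Any song except the one that just played.
        return random.choice([i for i in range(length) if i != current])
    if current + 1 < length:
        return current + 1  # normal advance
    # End of queue: wrap around only when repeat-all is on.
    return 0 if repeat == "all" else None
```

For a three-song queue, `next_index(2, 3, False, "off")` stops playback, while `next_index(2, 3, False, "all")` wraps back to the first song.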

📊 History & Analytics (6 tools)

| Tool | Parameters | Description |
| --- | --- | --- |
| log_song_play | title, artist, video_id, genre, mood | Logs play event with timestamp, hour, day-of-week |
| get_listening_history | limit=20 | Returns recent plays, newest first |
| get_listening_stats | - | Full analytics: top artists/songs, genre/mood distribution, hourly/daily patterns, streaks (current + longest), total hours |
| get_recently_played | limit=5 | Most recent unique songs (deduplicated) |
| clear_listening_history | - | Clears all history data |
| get_music_taste_summary | - | AI-readable taste profile summary for powering recommendations |
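
The current and longest streaks reported by get_listening_stats can be derived from the logged play dates. The sketch below is one plausible definition, not the code in HistoryTool.py: the function name `listening_streaks` and the rule that a streak stays alive if the last play was yesterday are assumptions.

```python
from datetime import date, timedelta
from typing import Iterable, Tuple


def listening_streaks(play_dates: Iterable[date], today: date) -> Tuple[int, int]:
    """Return (current_streak, longest_streak) in days."""
    days = sorted(set(play_dates))
    if not days:
        return 0, 0
    # Longest run of consecutive calendar days anywhere in the history.
    longest = run = 1
    for prev, cur in zip(days, days[1:]):
        run = run + 1 if cur - prev == timedelta(days=1) else 1
        longest = max(longest, run)
    # Current streak counts back from today, or from yesterday
    # if nothing has been played yet today.
    day_set = set(days)
    cursor = today if today in day_set else today - timedelta(days=1)
    current = 0
    while cursor in day_set:
        current += 1
        cursor -= timedelta(days=1)
    return current, longest


plays = [date(2025, 1, 1), date(2025, 1, 2), date(2025, 1, 3),
         date(2025, 1, 5), date(2025, 1, 6)]
current, longest = listening_streaks(plays, today=date(2025, 1, 6))
```

Here the history contains a three-day run (1 to 3 January) and a two-day run ending today, so the result is a current streak of 2 and a longest streak of 3.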

🎭 Mood Engine

SoniQue detects emotion from your text using keyword-weighted scoring and maps it to one of 12 mood categories, each with curated genres and search queries.

| Mood | Emoji | Energy | Genres |
| --- | --- | --- | --- |
| Happy | 😊 | High | Pop, Dance, Funk, Disco |
| Sad | 😢 | Low | Ballad, Acoustic, Indie Folk |
| Energetic | ⚡ | Very High | EDM, Hip-Hop, Rock, Drum & Bass |
| Relaxed | 🧘 | Low | Lo-fi, Ambient, Jazz |
| Romantic | 💕 | Medium | R&B, Soul, Soft Pop |
| Angry | 😤 | Very High | Metal, Punk, Hard Rock |
| Nostalgic | 📼 | Medium | 80s, 90s, Classic Rock |
| Focused | 🧠 | Low-Med | Lo-fi, Classical, Ambient |
| Party | 🎉 | Very High | Dance, EDM, Latin |
| Sleepy | 😴 | Very Low | Ambient, Sleep, Piano |
| Motivated | 🔥 | High | Hip-Hop, Rock, Anthems |
| Melancholic | 🌧️ | Low | Post-Rock, Shoegaze, Dream Pop |

How mood detection works:

  1. User text is scanned against keyword dictionaries for each mood
  2. Keyword matches are scored by word length (longer = more specific = higher weight)
  3. Confidence = best_score / total_score
  4. The detected mood maps to curated YouTube search queries
  5. Agent automatically searches, plays, and logs with mood metadata
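
Steps 1 to 3 can be sketched in a few lines of Python. This is an illustration only: the keyword lists in `MOOD_KEYWORDS` and the function name `detect_mood_sketch` are hypothetical, and only three of the 12 moods are shown.

```python
import re
from typing import Optional, Tuple

# Hypothetical keyword lists; the real dictionaries live in MoodTool.py.
MOOD_KEYWORDS = {
    "sad": ["sad", "lonely", "heartbroken"],
    "relaxed": ["calm", "relax", "soothing"],
    "energetic": ["pumped", "workout", "energy"],
}


def detect_mood_sketch(text: str) -> Tuple[Optional[str], float]:
    """Return (mood, confidence) using length-weighted keyword matches."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    scores = {}
    for mood, keywords in MOOD_KEYWORDS.items():
        # Longer keywords are treated as more specific, so they weigh more.
        score = sum(len(k) for k in keywords if k in words)
        if score:
            scores[mood] = score
    if not scores:
        return None, 0.0
    best = max(scores, key=scores.get)
    # Confidence = best_score / total_score across all matched moods.
    return best, scores[best] / sum(scores.values())


mood, confidence = detect_mood_sketch("I'm feeling sad and lonely, play something soothing")
```

In this example "sad" and "lonely" score 3 + 6 = 9 for the sad mood and "soothing" scores 8 for relaxed, so the result is `sad` with a confidence of 9/17, roughly 0.53. A confidence near 1.0 means only one mood matched.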

⚙️ Agent Configuration

| Parameter | Value |
| --- | --- |
| Framework | jentis ReAct (Reasoning + Acting) |
| LLM | Google Gemini gemini-3-flash-preview |
| Temperature | 0.4 (balanced creativity + accuracy) |
| Max Tokens | 8,192 |
| Memory | 50-turn sliding window (100 messages) |
| Tools | 49 registered across 8 modules |
| Strategy | Multi-step tool chaining with mandatory workflows |
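
The 50-turn window (100 messages) can be illustrated with a bounded deque that is written to disk after every message. This is a stand-in written for this README, not the jentis ConversationMemory class: the name `SlidingMemory` and the message fields `role` and `content` are assumptions.

```python
import json
import tempfile
from collections import deque
from pathlib import Path

MAX_TURNS = 50  # one turn = one user message + one assistant reply


class SlidingMemory:
    """Hypothetical stand-in for ConversationMemory."""

    def __init__(self, path: Path, max_turns: int = MAX_TURNS):
        self.path = path
        # deque(maxlen=...) silently drops the oldest message when full.
        self.messages = deque(maxlen=max_turns * 2)
        if path.exists():
            self.messages.extend(json.loads(path.read_text(encoding="utf-8")))

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        self.path.write_text(
            json.dumps(list(self.messages), ensure_ascii=False), encoding="utf-8"
        )

    @property
    def message_count(self) -> int:
        return len(self.messages)


# Demo: 60 turns are added, but only the most recent 50 survive.
memory = SlidingMemory(Path(tempfile.mkdtemp()) / "conversation_memory.json")
for i in range(60):
    memory.add("user", f"message {i}")
    memory.add("assistant", f"reply {i}")
```

After the loop, `memory.message_count` is 100 and the oldest retained entry is "message 10", since the first ten turns have been pushed out of the window.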

Mandatory Agent Workflows

The agent follows strict multi-step workflows to ensure complete task execution:

| User Request | Agent Workflow |
| --- | --- |
| "Play <song>" | youtube_search → download_and_play_song → log_song_play |
| "I feel sad" | detect_mood → get_mood_playlist → youtube_search → download_and_play_song → log_song_play |
| "Show lyrics" | get_now_playing → get_lyrics |
| "Recommend songs" | get_preferences + get_music_taste_summary → personalised suggestions |
| "Play next" | next_in_queue → download_and_play_song → log_song_play |
| "My stats" | get_listening_stats → formatted analytics |

💬 Example Conversations

🎵 Playing Music

You:     "Play Bohemian Rhapsody"
SoniQue: 🎵 Now playing: Bohemian Rhapsody by Queen! Rock on! 🎸

🎭 Mood-Based Playback

You:     "I'm feeling really stressed and need to relax"
SoniQue: I can sense you need some calm vibes 😌 Let me play something relaxing...
         🎵 Now playing: Weightless by Marconi Union, the most relaxing song ever!

🎤 Fetching Lyrics

You:     "Show me the lyrics for this song"
SoniQue: Here are the lyrics for "Bohemian Rhapsody" by Queen:
         
         Is this the real life? Is this just fantasy?
         Caught in a landslide, no escape from reality...

📝 Queue Management

You:     "Queue up some party songs"
SoniQue: 🎉 I've added 3 party bangers to your queue:
         1. Uptown Funk - Bruno Mars
         2. Don't Stop Me Now - Queen
         3. Levitating - Dua Lipa
         Want me to start playing?

📊 Analytics

You:     "What are my listening stats?"
SoniQue: 📊 Here's your music journey:
         • Total Plays: 47 songs across 12 artists
         • Top Artist: Queen (8 plays)
         • Top Genre: Rock 🎸
         • Peak Hour: 6:00 PM
         • Listening Streak: 5 days! 🔥

❤️ Learning Preferences

You:     "I love jazz and lo-fi but hate heavy metal"
SoniQue: Got it! Added jazz and lo-fi to your favourites ❤️
         and heavy metal to your dislikes. I'll keep this in mind
         for recommendations!

⚡ How It Works

Request Flow

1. User types message in the chat UI
2. Frontend sends POST to /api/chat
3. Flask lazy-loads the agent (first request only)
4. Agent receives message + conversation context
5. Gemini LLM reasons about which tools to call
6. Agent executes tools sequentially (multi-step chaining)
7. Agent formats a natural language response
8. Response sent back to frontend as JSON
9. Frontend renders markdown response in chat bubble
10. Status polling updates now-playing and queue sidebar

Key Design Decisions

| Decision | Rationale |
| --- | --- |
| Single agent, no sub-agents | Faster response times, simpler architecture, one LLM call per request |
| Lazy agent loading | Heavy imports (pygame, yt-dlp) only loaded on first chat request |
| Thread-safe singleton | Multiple browser tabs won't create duplicate agents |
| JSON file storage | Zero-config persistence, easy to debug, no database needed |
| UTF-8 mojibake repair | Windows' default cp1252 encoding corrupts emojis; agent.py repairs them automatically |
| Multi-step tool chaining | "Play X" requires search → download → log, and the agent enforces this order |
| youtubesearchpython | No YouTube API key needed; works out of the box |
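
The lazy-loading and thread-safe singleton decisions are usually combined with double-checked locking. The sketch below shows the pattern in isolation. `build_agent` is a placeholder for the expensive setup and `get_agent` is a hypothetical accessor, so neither is the code in app.py.

```python
import threading

_agent = None
_agent_lock = threading.Lock()


def build_agent():
    """Placeholder for the expensive setup (heavy imports, tool registration)."""
    return object()


def get_agent():
    """Create the agent on first use; later calls reuse the same instance."""
    global _agent
    if _agent is None:  # fast path: no lock once initialised
        with _agent_lock:
            if _agent is None:  # re-check: another thread may have won the race
                _agent = build_agent()
    return _agent
```

A route handler would call `get_agent()` on every request. Only the first call pays the start-up cost, and concurrent first requests from several browser tabs still end up sharing one instance.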

👥 Team StarkMind

Hack KRMU 5.0

| Member | Role |
| --- | --- |
| Jitin Kumar Sengar | Team Lead |
| Sonia | Developer |
| Vishakha Gaur | Developer |
| Yashwant Giri | Developer |

📄 License

This project is licensed under the MIT License. See LICENSE for details.


Built with ❤️ by Team StarkMind · Hack KRMU 5.0

jentis · Google Gemini · Flask · pygame · yt-dlp
