I build end-to-end data pipelines and analytical systems using Python and SQL, with a focus on data modeling and orchestration.
Most of my work sits at the intersection of:
- Data Engineering
- Analytical Modeling
- Personal Data Exploration
I particularly enjoy building analytics pipelines and tools around things I find worth tracking, mainly cricket and music.
End-to-end ELT data pipeline for ball-by-ball cricket match data using Python, PostgreSQL, dbt, and Airflow.
- Incremental ingestion & loading
- Layered warehouse modeling
- Fully orchestrated transformations via Airflow (Astronomer Cosmos)
🔗 https://github.com/shsiddhant/cricket-warehouse
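As a rough illustration of the incremental ingestion idea, here is a minimal sketch, not the repository's actual code. It assumes one JSON file per match, named by match ID, and a set of IDs already present in the warehouse; `select_new_matches`, the file names, and the IDs are all hypothetical.

```python
from pathlib import Path


def select_new_matches(available: list[str], loaded_ids: set[str]) -> list[str]:
    """Return match files whose ID (the file stem) has not been loaded yet."""
    # Hypothetical convention: "1003.json" holds the match with ID "1003".
    return sorted(f for f in available if Path(f).stem not in loaded_ids)


# Example run with made-up file names and IDs.
available = ["1001.json", "1002.json", "1003.json"]
loaded_ids = {"1001", "1002"}
print(select_new_matches(available, loaded_ids))  # ['1003.json']
```

In a real pipeline the set of loaded IDs would typically come from a query against the raw layer, so that re-running the load skips matches that are already in the warehouse.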
memory.fm is a web application for exploring music listening history from Last.fm and Spotify.
Instead of focusing only on aggregate stats, it surfaces long-term and local patterns such as attachment, repetition, and obsessive listening, to help you revisit periods of your life through music.
🔗 https://github.com/shsiddhant/memory.fm
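One simple way to think about a "local" pattern like repetition is to count plays of the same track within a single day. The sketch below is only an assumption about how such a signal could be computed, not how memory.fm does it; `repeat_days`, the `min_plays` threshold, and the sample data are hypothetical.

```python
from collections import Counter
from datetime import date


def repeat_days(scrobbles: list[tuple[date, str]], min_plays: int = 5):
    """Return (day, track, plays) for tracks played at least min_plays times in one day."""
    counts = Counter(scrobbles)  # keyed by (day, track)
    return sorted(
        (day, track, plays)
        for (day, track), plays in counts.items()
        if plays >= min_plays
    )


# Made-up listening history: one track on heavy repeat, another played twice.
history = [(date(2024, 1, 1), "Track A")] * 5 + [(date(2024, 1, 1), "Track B")] * 2
print(repeat_days(history))
```

A per-day count is the crudest possible window; a rolling window over several days would catch repetition that spans midnight.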
A PySide6 desktop application that turns your chat history into a book-like reading experience, with chapters, navigation, and structure instead of endless scrolling.
🔗 https://github.com/shsiddhant/memory.text
Machine learning project predicting match outcomes using features engineered from historical match data.
🔗 https://github.com/shsiddhant/womens-wc
Lightweight offline journaling application with password protection and Markdown support.
🔗 https://github.com/shsiddhant/memory.journal
Core: Python • SQL • PostgreSQL • dbt • Airflow
Back-end & Data: FastAPI • Pandas • NumPy
Front-end & Infra: React • Docker • Git
- Data engineering & pipeline design
- Analytical data modeling
- Sports analytics (especially cricket)
- Personal data products
- Python-based CLI tools
- Improving pipeline design and orchestration patterns
- Performance optimization for data processing workflows
- T20 cricket analytics