
Matteo Mirgone

MS Applied Data Science @ The University of Chicago · Business Engineering @ KU Leuven


About

I work at the intersection of data science and business strategy — building models that translate cleanly into decisions for non-technical stakeholders. My training spans classical statistics (OLS, GLM, time series), applied machine learning, and modern NLP (transformer fine-tuning, RAG pipelines, topic modeling at scale). I'm most interested in roles where quantitative work has to make it out of the notebook and into a conversation with an operator, an analyst, or a founder.

I grew up across six countries, hold dual EU/US citizenship, and work in four languages. That background makes me particularly drawn to companies operating across borders — marketplaces, fintech, mobility, and logistics platforms where the data is messy in more than one jurisdiction at a time.

Currently

  • Spring 2026 capstone with HERE Technologies — building an AI agent that automates quality checks on operator-resolved support tickets (team "Ground Truth").
  • Coursework in Generative AI (RAG systems, embeddings) and ML II (optimization, regularization, transformer architectures).
  • Portfolio build-out — migrating academic and consulting projects into public repositories with proper documentation.

Featured projects

| Project | Domain | Stack | Repo |
| --- | --- | --- | --- |
| Carbon Intensity of Electricity: Cross-country OLS | Energy · Econometrics | Python · statsmodels · HC3 robust SE | carbon-intensity-regression |
| AI Impact Analysis: 191K news articles, 2022–2026 | NLP at scale | BERTopic · DistilBERT fine-tuning · spaCy NER | ai-impact-analysis |
| WELFake Fake News Detection | Applied ML | scikit-learn · Ridge Classifier · text pipelines | welfake-fake-news-detection |
| CTA Ridership Forecasting | Time series | SARIMA · GARCH · Prophet · LSTM | cta-ridership-forecasting |
| HERE Places: Customer Issue Management | LLM agents | RAG · evaluation · workflow automation | in progress |

Each repo includes a detailed README, reproducibility instructions, and methodology notes. For write-ups of projects I can't open-source (company take-homes, consulting engagements), I'm happy to share privately on request.

Technical toolkit

Languages: Python · R · SQL · LaTeX
Machine learning: scikit-learn · statsmodels · PyTorch · Hugging Face Transformers
NLP: transformers · spaCy · BERTopic · ChromaDB · sentence-transformers
Time series: ARIMA / SARIMA · GARCH · Prophet · LSTM · DeepAR · N-BEATS · TimeGPT
Data & BI: pandas · NumPy · Power BI · Microsoft Fabric · ETL pipelines
Visualization: matplotlib · seaborn · Plotly
Tooling: Git · Jupyter · Linux · Docker (intermediate)

Background

Education

  • The University of Chicago — MS in Applied Data Science (2025–2026) — merit scholarship
  • KU Leuven — Business Engineering

Selected experience

  • Data & Business Intelligence Intern — SMT Belgium (Power BI, Microsoft Fabric, ETL, governance standards)
  • Consultant — UChicago Gargoyle Consulting Club (healthtech AI startup engagement)
  • Research contributor — KU Leuven × NGO Sabore's Well (humanitarian water pricing, Kenya)
  • PIP Consultant — Ray & Jules Sustainable Coffee (competitive intelligence)
  • Secretary & Marketing — Rotaract KU Leuven

Languages: English · Dutch · Italian · French

Get in touch


Pinned

  1. ai-impact-analysis

    End-to-end NLP pipeline on 191K news articles: BERTopic topic modeling, spaCy NER, DistilBERT sentiment fine-tuning (95.6% accuracy), and temporal industry analysis.

    Jupyter Notebook

  2. carbon-intensity-regression

    Cross-country OLS regression of electricity carbon intensity on energy-mix composition.

    Jupyter Notebook

  3. welfake-fake-news-detection

    Fake news classification on the WELFake benchmark (14 classifiers, TF-IDF + handcrafted features, unsupervised analysis). Ridge Classifier F1 = 0.977, validated via 5-fold CV and learning curves.

    Jupyter Notebook

  4. cta-ridership-forecasting

    Daily CTA ridership forecasting (SARIMA, GARCH, Prophet, LSTM) with a Chow / SARIMAX / Bayesian Causal Impact analysis of COVID-19's permanent effect.

    Jupyter Notebook