Pablo Ferrer González

Data Scientist with 3+ years of experience, currently on the AI team at SDG Group for Santander España — working on the Speech Analytics platform, where I take audio and turn it into analytics with LLMs. I build production data systems; the hardest part is making them boring to operate.

B.Sc. in Data Science · University of Valencia · Top 5% · GPA 8.5/10


Currently

  • Building the analytics side of Santander's Speech Analytics platform on Azure Databricks & PySpark
  • Shipped LLM Analytics enhancements to production
  • Working with RAG, Prompt Engineering, FastAPI, and MLflow
  • Goal: Full Stack AI Engineer

Stack

| Category | Tools |
| --- | --- |
| AI & ML | LLMs, RAG, Prompt Engineering, TensorFlow, Hugging Face, MLflow |
| Data Science | Python, PySpark, FastAPI, Streamlit, R |
| Cloud & DB | Azure Databricks, AWS, Snowflake, PostgreSQL, SQL |
| DevOps | Git, GitHub Actions, Jenkins, Gluon, OpenShift |

Highlights

  • 4th · Cajamar Datathon 2024 (best AI solution)
  • 1st · Visualization Contest 2021
  • 6 · concurrent workstreams managed at SDG
  • 8.5 · GPA (top 5%), UV Data Science

Certifications

  • Deep Learning · DeepLearning.AI
  • TensorFlow Developer · DeepLearning.AI
  • ML Engineering for Production (MLOps) · DeepLearning.AI
  • Azure AI Fundamentals · Microsoft
  • Databricks Lakehouse Platform · Databricks
  • Cambridge C1 Advanced

Spanish (native) · English (C1) · Catalan (C1) · French (A1)

pablo.ferrergonzalez.cd@gmail.com · LinkedIn

Pinned

  1. gamma · Public

    An end-to-end Data Science and MLOps pipeline for predicting customer churn, featuring automated feature engineering, MLflow tracking, and a production-grade FastAPI serving layer.

    HTML

  2. aggity · Public

    UniversityHack 2024 challenge solution focused on lot-level industrial analytics and predictive modeling.

    Python

  3. coulang · Public

    Privacy-first Streamlit app and CLI for analyzing Spanish/English language balance in two-person WhatsApp chats.

    Python

  4. quelque · Public

    Speech-to-text pipeline accessible via a web app. It supports multiple providers, does not cache API keys, and turns audio files into downloadable transcripts and summaries.

    Python

  5. mandarine · Public

    Anomaly detection in industrial images: final degree thesis, anomalib library benchmark, and reproducible inference pipeline.

    Jupyter Notebook

  6. kappa · Public

    End-to-end multivariate CO2 forecasting case study with Darts, model benchmarking, reproducible artifacts, Sphinx docs, and a lightweight FastAPI app.

    HTML