João Vitor Martins joaovitormsilva

👋 Hi there! Welcome to my GitHub profile

I'm João Vitor Martins da Silva

🎓 Bachelor's Student in Information Systems at the University of São Paulo (USP)
💼 Data Engineer at F1rst Digital Services

💡 About Me

I'm a Data Engineer passionate about building scalable data pipelines and productionizing machine learning models. With a technical background in Electronics and hands-on experience in automated testing, I now focus on data engineering, cloud optimization, and AI/ML operations.

Key Achievements:

🚀 Productionized 80+ machine learning models using Kedro and Azure Databricks
💰 Reduced cloud infrastructure costs by 34% through systematic optimization
✅ Maintained 96% regression test coverage for critical financial applications
📊 Built real-time pipelines that process data from multiple sources

🔧 Tech Stack

Expertise:

Data Engineering: ETL/ELT Pipelines, Data Modeling, Data Quality
ML Operations: Model Deployment, Kedro, Control-M Orchestration
Cloud Optimization: FinOps, Resource Management, Cost Reduction
Programming: Python, PySpark, SQL

📌 Featured Projects

🏦 FinData-Intelligence

An end-to-end Data Engineering & AI platform for automated bank statement processing, intelligent expense classification using LLMs, and investment portfolio analytics.

Tech Stack: Python, LLMs, Data Engineering, AI

📊 Vendas-Livrarias (Bookstore Sales Analysis)

Complete project covering data ingestion, analysis, and quality testing of sales data, built with PySpark on the Databricks platform.

Tech Stack: PySpark, Databricks, Data Quality Testing
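
To give a feel for what "quality testing" of sales data can involve, here is a minimal sketch in plain Python. It is not code from the repository: the column names (`sale_id`, `price`, `quantity`) and the three rules are assumptions chosen for illustration, and the real project runs its checks with PySpark on Databricks.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class QualityReport:
    """Counts of rule violations found in a batch of sales rows."""
    total: int
    null_price: int
    negative_quantity: int
    duplicate_ids: int

    @property
    def passed(self) -> bool:
        # The batch passes only when no rule was violated.
        return (
            self.null_price == 0
            and self.negative_quantity == 0
            and self.duplicate_ids == 0
        )


def check_sales(rows: list[dict]) -> QualityReport:
    """Apply three hypothetical rules: price present, quantity >= 0, unique sale_id."""
    seen_ids = set()
    duplicates = 0
    for row in rows:
        if row["sale_id"] in seen_ids:
            duplicates += 1
        seen_ids.add(row["sale_id"])

    return QualityReport(
        total=len(rows),
        null_price=sum(1 for r in rows if r.get("price") is None),
        # A missing quantity is treated as 0 here, so it is not counted as negative.
        negative_quantity=sum(1 for r in rows if (r.get("quantity") or 0) < 0),
        duplicate_ids=duplicates,
    )
```

Calling `check_sales(rows).passed` on each ingested batch gives a single boolean that a pipeline step could use to decide whether to continue or stop.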

🌡️ Monitoramento-Sensor-IoT (IoT Sensor Monitoring)

Real-time data processing system for IoT sensors using Kafka for messaging, Spark Structured Streaming for processing, and PostgreSQL for storage.

Tech Stack: Kafka, Spark Streaming, PostgreSQL, Docker, Python
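
The processing step in a pipeline like this often comes down to a windowed aggregation per sensor. The sketch below shows that idea in plain Python as a tumbling-window average. It is an illustration only, not the project's code: the `(sensor_id, epoch_seconds, value)` record shape and the 60-second window are assumptions, and the project itself uses Spark Structured Streaming for this stage.

```python
from collections import defaultdict


def tumbling_window_avg(readings, window_seconds=60):
    """Average sensor values per (sensor_id, window_start).

    `readings` is an iterable of (sensor_id, epoch_seconds, value) tuples;
    this record shape is hypothetical.
    """
    # Maps (sensor_id, window_start) -> [running_sum, count]
    acc = defaultdict(lambda: [0.0, 0])
    for sensor_id, ts, value in readings:
        # Align the timestamp to the start of its fixed-size window.
        window_start = int(ts // window_seconds) * window_seconds
        bucket = acc[(sensor_id, window_start)]
        bucket[0] += value
        bucket[1] += 1
    return {key: total / count for key, (total, count) in acc.items()}
```

In a streaming setting the same grouping is computed incrementally as messages arrive, and the per-window results are what would be written to storage.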

📚 Certifications & Training

Fundamentals of Data Engineering — Joe Reis
Practical Deep Learning for Coders — Jeremy Howard (Fast.ai)
Data Science, Spark & Data Visualization — Alura
Software Quality — Federado Foundation & Professional
DevOps & Git — F1rst Digital Services

🌐 Connect With Me

📧 Email: joao.vitormsilva@usp.br
💼 LinkedIn: linkedin.com/in/vitorjoao
🌍 Location: São Paulo, Brazil

🎯 Current Focus

🔭 Building scalable data pipelines with Azure Databricks
🌱 Learning advanced MLOps and real-time streaming architectures
👯 Looking to collaborate on open-source data engineering projects

⭐️ If you're working on innovative data engineering challenges or looking for a passionate data engineer with proven results, let's connect!