kajinmo/README.md

Welcome to my profile!

I am a Data Engineer with seven years of experience in data analytics, now specialized in building robust, scalable data pipelines and automating complex processes.

My current focus is on designing progressive ETL/ELT architectures, leveraging serverless solutions, and ensuring high data quality and reliability. I am constantly building end-to-end projects that bridge the gap between raw data and business value, applying software engineering best practices to data ecosystems.

Tech Stack & Capabilities

  • Languages: Python, SQL
  • Cloud (AWS & GCP): S3, EC2, Lambda, Glue, Athena, SQS, SNS, DynamoDB, EMR, Redshift, IAM | BigQuery
  • Processing & Orchestration: Airflow, PySpark, Spark, Apache Iceberg
  • Data Quality & Validation: Pydantic, Pandera
  • Infrastructure & Deployment: Docker, Terraform
  • Databases & Storage: Relational (SQL), NoSQL, Data Warehousing
  • Data Viz & Applications: Power BI, Tableau, Streamlit, Plotly/Dash

What I'm Currently Working On

  • Developing a progressive portfolio of serverless data pipelines (from extraction to consumption).
  • Implementing modern data lakehouse architectures using Apache Iceberg and AWS analytics services.
  • Applying advanced data quality and validation checks within automated workflows.

Education

  • MBA in Data Science and Analytics - Universidade de São Paulo (USP)
  • Bachelor's Degree in Science & Technology - Universidade Federal do ABC (UFABC)



Pinned

  1. etl-api-aws-lambda-sqs Public

    This project establishes an end-to-end serverless data engineering pipeline fully hosted on AWS using the Always Free Tier. It automates the extraction of raw events from the public GitHub API, ing…

    Python 1

  2. airflow-localstack Public

    This repository sets up a local development environment with Apache Airflow and LocalStack, using the Astronomer CLI. Ideal for testing DAGs that interact with AWS services such as S3, SQS, SNS and…

    Python 5 3

  3. lightweight-etl-pipeline-to-gcp Public

    An ETL pipeline that extracts data from multiple sources, masks sensitive information, and loads it into Google Cloud Storage and Google BigQuery. Designed for environments where Airflow is unavailable. …

    Python 1

  4. icecream Public

    Exploratory analysis and linear regression relating an ice cream's rating to its ingredients and flavors. The model was also deployed on Render so users could test how different ingredients…

    Jupyter Notebook 3

  5. crud-api Public

    Web application to perform asynchronous CRUD operations on a PostgreSQL database using FastAPI, SQLAlchemy, Pydantic, Streamlit and Docker.

    Python 1

  6. Sonar-Mines-vs-Rocks Public

    This is the dataset used in a study on sonar signal classification. The objective was to train a machine learning classification model to differentiate sonar signals between a metal cylinder and a …

    Jupyter Notebook 1