
b-tanyileke/loan_default_prediction


Loan Default Prediction

Predicting loan default using machine learning models.
This project was developed as part of a data science challenge by Coursera Project Network.


📊 Project Overview

Loan default prediction is a critical task for financial institutions, helping them minimize losses and manage risk.
This project applies traditional machine learning models and LightGBM to predict loan defaults on a large, highly imbalanced dataset.


๐Ÿ› ๏ธ Methods Used

  • Exploratory Data Analysis (EDA):

    • Distribution checks
    • Class imbalance visualization
    • Correlation heatmaps
  • Data Preprocessing:

    • Handling missing values
    • Encoding categorical features
    • Standard scaling numerical features
    • Train-test split with stratification
  • Models Implemented:

    • Logistic Regression (balanced weights)
    • Random Forest
    • Gradient Boosting
    • LightGBM (with scale_pos_weight)
  • Evaluation Metrics:

    • ROC-AUC Score (main metric)
    • Confusion Matrices
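
The steps listed above can be sketched end to end as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn is installed, uses a synthetic dataset from `make_classification` as a stand-in for the loan data (about 10% positives), and skips missing-value handling and categorical encoding because the synthetic features are all numeric and complete. The variable names and hyperparameters are illustrative assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the loan data: roughly 10% positives ("defaults").
X, y = make_classification(
    n_samples=5000,
    n_features=10,
    n_informative=5,
    weights=[0.9, 0.1],
    random_state=42,
)

# Stratify so the train and test sets keep the same default rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale numeric features, then fit a logistic regression whose
# class weights are inversely proportional to class frequency.
model = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
model.fit(X_train, y_train)

# ROC-AUC is computed from predicted probabilities, not hard labels.
proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, proba)
cm = confusion_matrix(y_test, model.predict(X_test))

print(f"ROC-AUC: {auc:.3f}")
print(cm)
```

Putting the scaler inside the `Pipeline` means it is fitted on the training split only, which avoids leaking test-set statistics into preprocessing.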

📈 Results

| Model               | ROC-AUC |
|---------------------|---------|
| Logistic Regression | 0.69    |
| Random Forest       | 0.68    |
| Decision Tree       | 0.67    |
| LightGBM            | 0.76    |

  • LightGBM achieved the best performance.
  • Handling class imbalance (via class_weight and scale_pos_weight) significantly improved ROC-AUC.
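
As a rough illustration of what these two settings amount to, the sketch below computes the weights from label counts. It is an assumption-laden example rather than the project's code: `imbalance_weights` is a hypothetical helper, the 900/100 label split is made up, the "balanced" formula follows scikit-learn's documented heuristic `n_samples / (n_classes * class_count)`, and the negatives-to-positives ratio is a commonly used starting value for `scale_pos_weight`, not necessarily the value used in the notebook.

```python
from collections import Counter


def imbalance_weights(labels):
    """Return (balanced class weights, scale_pos_weight) for binary 0/1 labels."""
    counts = Counter(labels)
    n_neg, n_pos = counts[0], counts[1]
    n_total = n_neg + n_pos

    # scikit-learn's "balanced" heuristic: n_samples / (n_classes * class_count).
    balanced = {
        0: n_total / (2 * n_neg),
        1: n_total / (2 * n_pos),
    }

    # Common starting point for scale_pos_weight: negatives per positive.
    scale_pos_weight = n_neg / n_pos
    return balanced, scale_pos_weight


# Made-up example: 900 non-defaults and 100 defaults.
labels = [0] * 900 + [1] * 100
balanced, spw = imbalance_weights(labels)

print(balanced)
print(spw)
```

Both schemes give the minority class the same relative emphasis: the ratio of the two balanced weights equals the negatives-to-positives ratio.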

Figure: LightGBM feature importance


Install Dependencies

git clone https://github.com/b-tanyileke/loan_default_prediction.git
cd loan_default_prediction
pip install -r requirements.txt

About

Loan default prediction notebook using traditional machine learning models and LightGBM. Tackling imbalanced financial data and evaluating performance with ROC-AUC.
