This project focuses on predicting employee attrition using machine learning and estimating the potential financial loss associated with it.
The project is divided into three main parts:
- Attrition Prediction (Classification): This involves building a classification model to predict whether an employee is likely to leave the company.
- Future Salary Prediction (Regression): This involves building a regression model to predict the future salaries of employees likely to stay in the company.
- Estimated Loss Calculation: This part estimates the potential financial loss associated with employee attrition.
The project uses the "IBM HR dataset" containing information about employee demographics, job satisfaction, performance, and attrition status.
Attrition Prediction (Classification):
- A Voting Classifier ensemble is trained on preprocessed data to predict attrition.
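A minimal sketch of a soft-voting ensemble follows. It assumes scikit-learn's `VotingClassifier`, two illustrative base models (logistic regression and a random forest), and synthetic data from `make_classification` in place of the preprocessed HR features. The project's actual base models, hyperparameters, and resampling steps are not specified here and may differ.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the preprocessed HR features (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.84, 0.16], random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Soft voting averages the class probabilities of the base models.
# The choice of base models here is illustrative, not the project's own.
clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",
)
clf.fit(X_train, y_train)

# Probability of the positive class (read here as "employee leaves").
attrition_proba = clf.predict_proba(X_test)[:, 1]
attrition_pred = clf.predict(X_test)
print("Test accuracy:", clf.score(X_test, y_test))
```

Soft voting is used in this sketch because it yields a per-employee probability, which is convenient if the prediction is later combined with a salary estimate.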
Future Salary Prediction (Regression):
- A Voting Regressor ensemble is trained on preprocessed data to predict future salaries of employees likely to stay.
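The regression step can be sketched in the same way. This example assumes scikit-learn's `VotingRegressor`, which averages the predictions of its base models, with two illustrative regressors and synthetic data from `make_regression` standing in for the features and salaries of employees predicted to stay. The base models are assumptions, not the project's confirmed configuration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for features and salary targets (assumption).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# VotingRegressor returns the average of the base models' predictions.
# The base models chosen here are illustrative only.
reg = VotingRegressor(
    estimators=[
        ("lin", LinearRegression()),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ]
)
reg.fit(X_train, y_train)

predicted_salary = reg.predict(X_test)
r2 = reg.score(X_test, y_test)  # coefficient of determination on held-out data
print("Test R^2:", r2)
```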
Estimated Loss Calculation:
- The expected loss for each employee is calculated and aggregated to estimate the total financial impact of attrition.
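As a rough illustration of the aggregation, the sketch below assumes that each employee's expected loss is the predicted attrition probability multiplied by a salary figure. This formula, the helper name `expected_losses`, and the sample numbers are all assumptions made for illustration; the notebook may define the loss differently.

```python
def expected_losses(attrition_probs, predicted_salaries):
    """Per-employee expected loss under the assumed formula: probability * salary."""
    if len(attrition_probs) != len(predicted_salaries):
        raise ValueError("inputs must have the same length")
    return [p * s for p, s in zip(attrition_probs, predicted_salaries)]


# Hypothetical values for three employees (not taken from the dataset).
probs = [0.5, 0.25, 0.0]
salaries = [4000.0, 8000.0, 6000.0]

losses = expected_losses(probs, salaries)
total_loss = sum(losses)  # aggregate across all employees
print(losses, total_loss)
```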
Libraries Used:
- pandas
- numpy
- scikit-learn (GridSearchCV and various models)
- xgboost
- imbalanced-learn (imported as imblearn; provides SMOTE)
- seaborn
- matplotlib
- Amancharla Anirudh
- Varshneya Kolla
This project demonstrates how machine learning can be applied to predict employee attrition and estimate the associated financial loss. It can be valuable for businesses that want to identify employees at risk of leaving and plan accordingly.
- Upload the Dataset
- Import Libraries
- Load the Dataset
- Run the Code: Execute the rest of the code in the notebook, including the data preprocessing, model training, evaluation, and estimated loss calculation steps (use Google Colab or Jupyter Notebook).
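The loading step above might look like the following sketch. It reads a tiny inline CSV so that it runs without the real file; the column names `Age`, `Attrition`, and `MonthlyIncome` and the `Yes`/`No` encoding are assumptions about the dataset's layout. In the notebook, the path of the uploaded dataset would be passed to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Tiny inline sample standing in for the uploaded dataset file (assumption).
sample_csv = io.StringIO(
    "Age,Attrition,MonthlyIncome\n"
    "41,Yes,5993\n"
    "49,No,5130\n"
    "37,Yes,2090\n"
)

df = pd.read_csv(sample_csv)

# Encode the target as 1 (left) / 0 (stayed) so classifiers can use it.
df["Attrition"] = df["Attrition"].map({"Yes": 1, "No": 0})
print(df.head())
```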