janmejoykar1807/Python_Data_Mining_Projects

Python Data Mining Projects

A collection of data mining and machine learning projects implemented in Python using Jupyter Notebooks. Each project applies core data science techniques — from regression and classification to clustering — on real-world datasets.


📁 Repository Structure

| File | Dataset | Techniques Used |
| --- | --- | --- |
| Data_mining_pj1(Utilities).ipynb | Utilities | Clustering (K-Means), Exploratory Data Analysis |
| Data_mining_pj2(Airfares).ipynb | Airfares | Linear Regression, Feature Selection |
| Data_mining_pj3(Baseball_Hitters).ipynb | Baseball Hitters | Regression, LASSO/Ridge, Model Evaluation |
| Data_mining_pj4(SpamBase).ipynb | SpamBase | Classification, Logistic Regression, Naive Bayes |

📊 Project Summaries

Project 1 — Utilities (Clustering)

Explores utility company data to identify groups of similar companies using unsupervised learning. Applies K-Means clustering and visualizes cluster characteristics.

Dataset: Utilities.csv
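
For readers new to the technique, here is a minimal sketch of K-Means clustering with scikit-learn. It is not taken from the notebook: the data is synthetic (three well-separated blobs standing in for numeric company features), and the choices of three clusters and standardization before clustering are assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for numeric company features (NOT the Utilities.csv data).
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(20, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.5, size=(20, 2)),
])

# Standardize so that no single feature dominates the Euclidean distance.
X_scaled = StandardScaler().fit_transform(X)

# n_clusters=3 is an assumption that matches how the synthetic data was built.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

print("cluster sizes:", np.bincount(labels))
print("within-cluster sum of squares:", round(kmeans.inertia_, 3))
```

On real data the number of clusters is usually not known in advance. A common approach is to compare the within-cluster sum of squares (`inertia_`) across several values of `n_clusters`.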

Project 2 — Airfares (Regression)

Analyzes domestic airfare pricing to identify key factors that drive ticket costs. Builds regression models to predict fare prices across routes.

Dataset: Airfares.csv
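
The following sketch illustrates the general idea of linear regression combined with feature selection. It uses synthetic data rather than Airfares.csv, and the specific tools (`SelectKBest` with `f_regression`, keeping two features) are assumptions chosen for illustration, not necessarily what the notebook does.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Synthetic data (NOT Airfares.csv): five candidate predictors,
# of which only the first two actually influence the target.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)

# Keep the k predictors with the strongest univariate relationship to y,
# then fit ordinary least squares on those predictors only.
model = make_pipeline(
    SelectKBest(score_func=f_regression, k=2),
    LinearRegression(),
)
model.fit(X_train, y_train)

selected = model.named_steps["selectkbest"].get_support(indices=True)
r2 = model.score(X_test, y_test)

print("selected feature indices:", selected.tolist())
print("held-out R^2:", round(r2, 3))
```

Putting the selector inside the pipeline means it is fitted on the training split only, so the held-out score is not influenced by the test rows.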

Project 3 — Baseball Hitters (Regression & Regularization)

Predicts baseball player salaries from batting statistics. Implements and compares several regression approaches, including regularization methods (LASSO and Ridge) that handle multicollinearity among the predictors.

Dataset: Hitters.csv
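
As a rough illustration of why regularization helps with multicollinearity, the sketch below compares ordinary least squares, Ridge, and LASSO on synthetic data in which two predictors are nearly identical. The data, the `alpha` values, and the 5-fold cross-validation setup are all assumptions for illustration and are not taken from the notebook or from Hitters.csv.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (NOT Hitters.csv) with two almost identical predictors.
rng = np.random.default_rng(2)
n = 150
base = rng.normal(size=n)
X = np.column_stack([
    base,
    base + rng.normal(scale=0.01, size=n),  # near-duplicate of column 0
    rng.normal(size=n),                     # unrelated noise predictor
])
y = 2.0 * base + rng.normal(scale=0.5, size=n)

# The alpha values are illustrative; in practice they are tuned.
models = {
    "ols": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
}

scores = {}
coefs = {}
for name, estimator in models.items():
    # Scale inside the pipeline so the penalty treats all features alike.
    pipe = make_pipeline(StandardScaler(), estimator)
    scores[name] = cross_val_score(pipe, X, y, cv=5, scoring="r2").mean()
    coefs[name] = pipe.fit(X, y)[-1].coef_
    print(f"{name}: mean CV R^2 = {scores[name]:.3f}, coef = {np.round(coefs[name], 2)}")
```

With correlated predictors the three models can reach similar predictive scores while assigning very different coefficients. Comparing the printed coefficients is often more informative than comparing the scores alone.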

Project 4 — SpamBase (Classification)

Builds a spam email classifier using features extracted from email content. Compares classification models on accuracy, precision, and recall.

Dataset: Spambase.csv
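
The comparison described above can be sketched as follows. This is an illustration on synthetic data generated with `make_classification`, not the notebook's code or the Spambase data. The README does not say which Naive Bayes variant is used, so `GaussianNB` here is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic binary classification data (NOT Spambase.csv).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=3
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=3
)

# GaussianNB is an assumed variant; the source only says "Naive Bayes".
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "naive_bayes": GaussianNB(),
}

results = {}
for name, clf in models.items():
    y_pred = clf.fit(X_train, y_train).predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
    }
    print(name, {metric: round(value, 3) for metric, value in results[name].items()})
```

For spam filtering, precision and recall matter separately: low precision means legitimate mail is flagged as spam, while low recall means spam reaches the inbox.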


🛠️ Technologies & Libraries

  • Python 3.x
  • Jupyter Notebook
  • pandas, numpy
  • scikit-learn
  • matplotlib, seaborn

🚀 Getting Started

  1. Clone the repository:

    git clone https://github.com/janmejoykar1807/Python_Data_Mining_Projects.git
  2. Install dependencies:

    pip install pandas numpy scikit-learn matplotlib seaborn jupyter
  3. Launch Jupyter Notebook:

    jupyter notebook
  4. Open any .ipynb file to explore the project.


👤 Author

Janmejoy Kar
Data Science learner applying Python, R, and SQL to data analysis and predictive modeling.
GitHub Profile
