- Report average classification and regression performance over N independent experimental runs using AdaBoost, Gradient Boosting Machine, XGBoost, and Random Forests.
- Investigate hyperparameters for the different models, e.g. tree depth for the tree-based models.
- Compare the results with Simple Neural Networks (Adam/SGD) and Decision Trees.
- Report the mean and standard deviation of results for the selected classification and regression datasets.
- Regression - Energy data: https://archive.ics.uci.edu/dataset/242/energy+efficiency
- Classification - Pima data: https://www.kaggle.com/kumargh/pimaindiansdiabetescsv
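The run-averaging protocol above can be sketched as follows. This is a minimal illustration, not the repo's `ensemble.py`: it uses a synthetic dataset via `make_classification` in place of the Pima CSV, a small N, and default hyperparameters; XGBoost is left out so the snippet depends only on scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, GradientBoostingClassifier,
                              RandomForestClassifier)
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

N = 5  # number of independent experimental runs

# Synthetic stand-in for the Pima diabetes data (8 features, binary label).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

models = {
    "AdaBoost": AdaBoostClassifier,
    "GradientBoosting": GradientBoostingClassifier,
    "RandomForest": RandomForestClassifier,
}

for name, Model in models.items():
    scores = []
    for run in range(N):
        # A fresh train/test split and model seed per run gives
        # independent runs to average over.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.4, random_state=run)
        clf = Model(random_state=run).fit(X_tr, y_tr)
        scores.append(accuracy_score(y_te, clf.predict(X_te)))
    print(f"{name}: mean={np.mean(scores):.3f} std={np.std(scores):.3f}")
```

Swapping in the real dataset is a matter of replacing `make_classification` with a `pandas.read_csv` of the Pima file and splitting off the last column as the label.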
- XGBoost https://xgboost.readthedocs.io/en/stable/install.html
- Sklearn https://scikit-learn.org/stable/install.html
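The tree-depth investigation from the task list can be sketched as a sweep over `max_depth` with cross-validated error at each setting. This is an illustrative sketch only: it uses a synthetic regression problem via `make_regression` standing in for the Energy efficiency data, and Gradient Boosting as one representative tree-based model.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the Energy efficiency regression data.
X, y = make_regression(n_samples=400, n_features=8, noise=10.0, random_state=0)

results = {}
for depth in [1, 2, 3, 5, 8]:
    model = GradientBoostingRegressor(max_depth=depth, random_state=0)
    # 5-fold cross-validated RMSE (scikit-learn returns it negated).
    rmse = -cross_val_score(model, X, y,
                            scoring="neg_root_mean_squared_error", cv=5)
    results[depth] = (rmse.mean(), rmse.std())
    print(f"max_depth={depth}: RMSE {rmse.mean():.2f} +/- {rmse.std():.2f}")
```

The same loop applies to the other tree-based models by swapping the estimator class; reporting mean and standard deviation per depth matches the protocol in the task list.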
- ensemble.py was written by Rohitash Chandra. Gemini was used to make ensemble-gemini.py and [ensemble-tutorial.ipynb](https://github.com/sydney-machine-learning/ensemble-learning-tutorial/blob/master/ensemble-tutorial.ipynb)
- ensemble_tutorial_detailed.md was created by ChatGPT using ensemble.py
- ChatGPT Tutorial support: https://chatgpt.com/share/69d4e146-6bcc-8324-a373-63926991da20
- Khan, Azal Ahmad, Omkar Chaudhari, and Rohitash Chandra. "A review of ensemble learning and data augmentation models for class imbalanced problems: Combination, implementation and evaluation." Expert Systems with Applications 244 (2024): 122778. https://www.sciencedirect.com/science/article/pii/S0957417423032803