Skip to content
This repository has been archived by the owner on Nov 14, 2021. It is now read-only.

Scripts developed for the "Knowledge Extraction and Machine Learning" (ECAC) class "To Loan or Not To Loan" data mining case study / Kaggle competition.

Notifications You must be signed in to change notification settings

Eduardo-LP-Silva/To-Loan-or-Not-To-Loan

Repository files navigation

ECAC-Competition

Scripts developed for the "Knowledge Extraction and Machine Learning" (ECAC) class "To Loan or Not To Loan" data mining case study / Kaggle competition:

  • data_understanding.py - Contains methods to analyse the multiple data sources and provide visual representations of relevant patterns, as well as to calculate various statistics regarding certain attributes.
  • data_preparation.py - Contains methods to pre-process the data, including filling missing values with the previously calculated statistics, removing correlated attributes and outliers, one-hot encoding categorical features and normalizing the data.
  • k_nn.py / rf.py / svm.py - Contains the methods associated with each algorithm (K-NN, random forest, SVM) to split and balance the data, perform hyper-parameter optimizations and perform the final class predictions.

About

Scripts developed for the "Knowledge Extraction and Machine Learning" (ECAC) class "To Loan or Not To Loan" data mining case study / Kaggle competition.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages