Code and Explanation of our approach for the Data Science Event, Cassandra

Terabyte17/Cassandra-Udyam-Defaulter-Prediction

Cassandra Udyam Defaulter Prediction

This repository contains the code and an explanation of our approach to building a Machine Learning model that predicts loan defaulters for a bank. This was the problem statement of Cassandra, the data science event of Udyam, the annual technical fest of the Electronics Engineering Society of IIT-BHU. With this approach, we secured 2nd position.

Our Approach in Brief

Extensive analysis of the data revealed several key properties, including temporal consistency in the last_update column, a relationship between the last_update and recent_payment_activity columns, and an imbalance of labels in the dataset. We applied data cleaning and feature engineering, then aggregated features and merged the 2 datasets. Next, we split the dataset with StratifiedKFold and applied SMOTE to the training dataset. We used the ROC-AUC score to validate our models and the Optuna framework for hyperparameter tuning. Our model was an ensemble of a Decision Tree Classifier and an AdaBoost Classifier.
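The validation loop described above can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the code from this repository: the data is synthetic (`make_classification`), `smote_like_oversample` is a hypothetical, simplified stand-in for SMOTE (in practice a library such as imbalanced-learn would likely be used), the ensemble is assumed to be a soft-voting average, and the hyperparameters are placeholders rather than Optuna-tuned values. The point it shows is that oversampling is applied only to the training part of each stratified fold, so the validation fold keeps the original class balance.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, VotingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier


def smote_like_oversample(X, y, k=5, seed=0):
    """Simplified SMOTE-style oversampling (hypothetical helper, binary labels).

    New minority samples are interpolated between a minority point and one
    of its k nearest minority neighbours until the classes are balanced.
    """
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()
    X_min = X[y == minority]
    if n_new == 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    # Drop the first column, which is normally the query point itself.
    neighbours = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    X_new = X_min[base] + gap * (X_min[picked] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


def cross_validated_auc(X, y, n_splits=5, seed=0):
    """Return the ROC-AUC of the ensemble on each stratified fold."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, val_idx in skf.split(X, y):
        # Oversample the training fold only; the validation fold is untouched.
        X_tr, y_tr = smote_like_oversample(X[train_idx], y[train_idx], seed=seed)
        model = VotingClassifier(
            estimators=[
                ("tree", DecisionTreeClassifier(max_depth=5, random_state=seed)),
                ("ada", AdaBoostClassifier(n_estimators=50, random_state=seed)),
            ],
            voting="soft",  # assumption: average the predicted probabilities
        )
        model.fit(X_tr, y_tr)
        proba = model.predict_proba(X[val_idx])[:, 1]
        scores.append(roc_auc_score(y[val_idx], proba))
    return scores


# Synthetic, imbalanced stand-in for the competition data (roughly 90% / 10%).
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.9, 0.1], random_state=0
)
scores = cross_validated_auc(X, y)
print(f"mean ROC-AUC: {np.mean(scores):.3f}")
```

Oversampling inside the loop, rather than before the split, keeps synthetic points derived from validation rows out of the training data, which would otherwise inflate the validation score.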


Feature Aggregation

Team Members ✨


Yash Sahijwani


Somnath Sendhil Kumar


Vikhyath Venkatraman
