
Lacerdash/ML-for-Churn-predicting


Machine Learning for Churn Prediction

Churn prediction is crucial for businesses to identify customers who are likely to discontinue using their service. This repository contains my project, which aims to tackle this issue.

🚀 Click Here to Access the App! 🌟

Project Overview

Objectives: To identify potential churn customers and understand the associated patterns, thus enabling businesses to take proactive measures to retain them.

Data: The data is available here and the data dictionary here

Structure: The project is divided into 3 parts:

  1. Extract, Transform and Load (ETL) and Exploratory Data Analysis (EDA)
  2. Creating, Selecting and Optimizing models
  3. Creating a Streamlit app to deploy our model

1 - Extract, Transform and Load (ETL) and Exploratory Data Analysis (EDA)

ETL

  • The dataset, in JSON format, is imported into Python and undergoes transformation and cleaning. After processing, the data is saved to a CSV file.
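A minimal sketch of that step. The inline records, nesting, and file name are illustrative assumptions; the real project loads the full JSON export:

```python
import json
import pandas as pd

# Illustrative stand-in for the raw JSON export (the real file would be
# read with json.load from disk); field names are assumptions
raw = json.loads("""
[
  {"customerID": "0001", "Churn": "Yes", "account": {"Charges": {"Monthly": 29.85}}},
  {"customerID": "0002", "Churn": "No",  "account": {"Charges": {"Monthly": 56.95}}}
]
""")

# Flatten the nested records into a tabular DataFrame
df = pd.json_normalize(raw)

# Normalize the dotted column names produced by flattening
# (e.g. "account.Charges.Monthly" -> "account_Charges_Monthly")
df.columns = [c.replace(".", "_") for c in df.columns]

# Persist the cleaned table for the EDA stage
df.to_csv("churn_data.csv", index=False)
```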

EDA

  • Post-ETL, various visualizations are generated to delve deeper into the patterns within the data, identify potential problems, and better understand the overall structure of the dataset.
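A sketch of the kind of plots this stage produces, on a toy slice of the data (column names and values are assumptions, not the project's actual figures):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy slice of the churn table; column names are illustrative assumptions
df = pd.DataFrame({
    "Churn": ["Yes", "No", "No", "Yes", "No"],
    "MonthlyCharges": [70.3, 29.9, 56.1, 89.5, 42.0],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

# Class balance: how many customers churned vs. stayed
df["Churn"].value_counts().plot.bar(ax=ax1, title="Churn distribution")

# Do churners pay more per month? Compare the two groups
df.boxplot(column="MonthlyCharges", by="Churn", ax=ax2)

fig.savefig("eda_overview.png")
```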

Challenges faced:

  • Handling missing values
  • Data encoding
  • Correcting data types
  • Plotting relevant graphs for analysis
  • Creating functions in a Python file (helper.py) for a cleaner notebook
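The first three challenges above can be sketched on a toy frame (column names and the median-imputation choice are illustrative assumptions):

```python
import pandas as pd

# Toy frame illustrating the issues above; column names are assumptions
df = pd.DataFrame({
    "TotalCharges": ["29.85", " ", "108.15"],   # numeric values stored as strings
    "Partner": ["Yes", "No", "Yes"],
    "Contract": ["Month-to-month", "Two year", "One year"],
})

# Correct data types: coerce the blank entry to NaN, then impute it
df["TotalCharges"] = pd.to_numeric(df["TotalCharges"], errors="coerce")
df["TotalCharges"] = df["TotalCharges"].fillna(df["TotalCharges"].median())

# Encode binary Yes/No columns as 0/1 and one-hot encode multi-level categoricals
df["Partner"] = df["Partner"].map({"Yes": 1, "No": 0})
df = pd.get_dummies(df, columns=["Contract"])
```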

Detailed documentation of the ETL and EDA processes can be found in this notebook, which covers data cleaning, handling missing values, data exploration, and preliminary analysis.


2 - Creating, Selecting and Optimizing models

The second part of the project is dedicated to:

  • Processing the encoded_churn_data.csv file generated in the first part ("1 - Extract, Transform and Load") to create and compare models
    • Creating new columns
    • Scaling data
    • Balancing data
  • Creating Baseline Models
    • Creating 9 models (Decision Tree Regressor, Random Forest Regressor, Logistic Regression, KNeighborsClassifier, SVC, GradientBoostingClassifier, GaussianNB, AdaBoostClassifier and MLPClassifier)
  • Selecting the best model based on the chosen metric
  • Optimizing the best model's hyperparameters and assessing its results
    • Using nested cross-validation to perform hyperparameter tuning and model assessment (You can check my in-depth notebook on Nested Cross Validation)
  • Saving the model
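The baseline-and-select loop above can be sketched as follows. The synthetic data, the three-model subset, and recall as the chosen metric are all illustrative assumptions:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

# Synthetic stand-in for the processed churn table
X, y = make_classification(n_samples=300, n_features=10, random_state=42)

# A subset of the baseline candidates, compared on one chosen metric
# (recall is an illustrative choice: churn problems often prioritize
# catching likely churners over overall accuracy)
models = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "GaussianNB": GaussianNB(),
}

scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="recall").mean()
    for name, model in models.items()
}
best = max(scores, key=scores.get)
```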

All activities performed are documented in this notebook.
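For the nested cross-validation step, the shape of the procedure looks like this (model, grid, and synthetic data are illustrative assumptions; the idea is that the inner loop tunes while the outer loop measures):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the processed churn table
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Inner loop: hyperparameter tuning via grid search (grid is illustrative)
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
inner = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)

# Outer loop: an unbiased estimate of the tuned model's performance,
# since each outer test fold is never seen during tuning
outer_scores = cross_val_score(inner, X, y, cv=5)
```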


3 - Streamlit app and Model deployment

To deploy the model, we created a Streamlit app that lets users interact with it through a simple interface. This includes:

  • Model selection
  • Data insertion
  • Prediction of Churn probability
  • Exploratory Data Analysis tab with visualizations

Streamlit app