Udacity Arvato Identify Customer Segments

Capstone Project

Problem Statement

Consider a marketing campaign for a company (Arvato Financial Services) in which we need to select the individuals most likely to become the company's future customers. For this task, two kinds of data are available: demographic information for the general population of Germany (the country where the company is located) and information about individuals who are already customers of the company.
First, we analyze the demographic information of the German population to understand and explore its main characteristics.
Then, we build a predictive model that estimates, with reasonable accuracy, whether a person targeted by a given marketing campaign is likely to become a customer.
Finally, we classify each potential customer in a previously unseen test set and submit the results on the Kaggle platform.
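
The first step above, comparing the general population with existing customers, can be illustrated with a small segmentation sketch. It is not the code from this repository: the data is synthetic (the real tables are not public), and the choice of scikit-learn's StandardScaler, PCA and KMeans, as well as the helper name `cluster_shares`, are assumptions made for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for the general population: two groups of people,
# each described by 10 numeric features.
population = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(500, 10)),  # group A
    rng.normal(loc=4.0, scale=1.0, size=(500, 10)),  # group B
])

# Hypothetical customers, drawn mostly from group B.
customers = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(20, 10)),
    rng.normal(loc=4.0, scale=1.0, size=(180, 10)),
])

# Fit scaling, dimensionality reduction and clustering on the population only,
# then assign customers to the clusters learned from the population.
segmenter = make_pipeline(
    StandardScaler(),
    PCA(n_components=3, random_state=0),
    KMeans(n_clusters=2, n_init=10, random_state=0),
)
pop_labels = segmenter.fit_predict(population)
cust_labels = segmenter.predict(customers)


def cluster_shares(labels, n_clusters):
    """Fraction of rows that fall into each cluster."""
    return np.bincount(labels, minlength=n_clusters) / len(labels)


pop_share = cluster_shares(pop_labels, 2)
cust_share = cluster_shares(cust_labels, 2)

# A ratio above 1 marks a cluster that is over-represented among customers.
ratio = cust_share / pop_share
for cluster, value in enumerate(ratio):
    print(f"cluster {cluster}: customer share / population share = {value:.2f}")
```

Clusters whose ratio is well above 1 describe segments of the population that are more likely to contain future customers, which is the kind of information a segmentation report is meant to surface.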

Project Motivation

The project tackles a real problem for a company, with real data and several possible approaches. The data set is rich and the problem is interesting to solve. Submitting the work on Kaggle is a way to compare the quality of our algorithm with those of other students. That is why I chose this specific project, which motivated me to learn even more.

Installation

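The following packages are necessary: numpy, datetime, pandas, matplotlib, seaborn, math, sklearn, pylab, itertools, imblearn, pickle, xgboost. Of these, datetime, math, itertools and pickle are part of the Python standard library, and pylab ships with matplotlib. sklearn and imblearn are installed from PyPI as scikit-learn and imbalanced-learn.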
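
As a quick check before running the project code, the sketch below reports which third-party packages are missing from the current environment. It uses only the standard library. The mapping from import name to PyPI name is written out by hand, and the helper name `missing_packages` is made up for this example.

```python
import importlib.util

# Third-party import names from the list above, mapped to their PyPI names.
THIRD_PARTY = {
    "numpy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",  # also provides pylab
    "seaborn": "seaborn",
    "sklearn": "scikit-learn",
    "imblearn": "imbalanced-learn",
    "xgboost": "xgboost",
}


def missing_packages(modules=THIRD_PARTY):
    """Return the PyPI names of modules that cannot be found for import."""
    return sorted(
        pip_name
        for module, pip_name in modules.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        # Print a ready-to-copy install command for whatever is missing.
        print("pip install " + " ".join(missing))
    else:
        print("All required packages are available.")
```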

Files

final_project.pdf - Report with a detailed explanation of the entire project
proposal.pdf - Report with the proposal for this project
util.py - Python module for data processing and feature engineering
cluster.py - Python module with clustering methods for the segmentation report
pca.py - Python module with PCA methods for dimensionality reduction
Udacity_AZDIAS_052018.csv - Demographics data for the general population of Germany; 891 211 persons (rows) x 366 features (columns)
Udacity_CUSTOMERS_052018.csv - Demographics data for customers of a mail-order company; 191 652 persons (rows) x 369 features (columns)
Udacity_MAILOUT_052018_TRAIN.csv - Demographics data for individuals who were targets of a marketing campaign; 42 982 persons (rows) x 367 features (columns)
Udacity_MAILOUT_052018_TEST.csv - Demographics data for individuals who were targets of a marketing campaign; 42 833 persons (rows) x 366 features (columns)
unknown_values.csv - Mapping of each attribute to the value(s) that encode "unknown" for that attribute
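
To show how a mapping such as unknown_values.csv might be applied, here is a minimal sketch that turns "unknown" codes into missing values. The actual layout of the file is not documented here, so the column names `attribute` and `unknown`, the comma-separated code format, the feature names `FEATURE_A` and `FEATURE_B`, and the helper `replace_unknowns` are all assumptions for illustration, not the logic in util.py.

```python
import pandas as pd

# Assumed layout of the mapping: one row per attribute, with the codes that
# mean "unknown" stored as a comma-separated string. With the real file this
# frame would come from pd.read_csv, adjusted to the file's true columns.
mapping = pd.DataFrame({
    "attribute": ["FEATURE_A", "FEATURE_B"],
    "unknown": ["-1,0", "-1"],
})

# Build {column name: [codes that mean "unknown"]}.
unknown_map = {
    row.attribute: [int(code) for code in row.unknown.split(",")]
    for row in mapping.itertuples()
}


def replace_unknowns(df, unknown_map):
    """Return a copy of df in which each column's 'unknown' codes become NaN."""
    cleaned = df.copy()
    for column, codes in unknown_map.items():
        if column in cleaned.columns:
            # mask() replaces values with NaN wherever the condition is True.
            cleaned[column] = cleaned[column].mask(cleaned[column].isin(codes))
    return cleaned


# Toy demographic table with hypothetical feature names.
demo = pd.DataFrame({
    "FEATURE_A": [-1, 2, 0, 3],
    "FEATURE_B": [1, 2, -1, 1],
})

cleaned = replace_unknowns(demo, unknown_map)
print(cleaned)
```

Converting the codes to NaN first means that later steps (counting missing values per column, imputing, dropping sparse features) treat "unknown" and truly missing entries the same way.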

Results and Conclusions

The results of this work, along with implementation details, conclusions and future work, can be found in the file final_project.pdf.

Licenses and Acknowledgements

The project is part of Udacity's Machine Learning Nanodegree program. The data provided is not public and belongs to Arvato and Udacity.
