Data Analyst Nanodegree

Practical Statistics

Project: Analyze A/B Test Results

Currently, we have only one predictor variable: the page the user receives, from which we obtained an aggregate result. There may be other factors affecting conversion at a finer level than we are analysing right now. With additional features, we could better decide which page drives more conversion.

Now, along with testing whether the conversion rate differs between pages, we also add an effect based on the country in which a user lives. The country data is in countries.csv.
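As a rough illustration of what adding a country effect involves, the sketch below joins two small tables and turns the country into 0/1 indicator columns that a regression could use alongside the page indicator. The rows, the country labels A/B/C, the user_id join key in countries.csv, and the names ab_page, intercept, and add_country_features are all assumptions made for this example; they are not taken from the notebook.

```python
import csv
import io

# Hypothetical stand-ins for ab_data.csv and countries.csv (made-up rows).
AB_DATA = """user_id,group,landing_page,converted
1,control,old_page,0
2,treatment,new_page,1
3,treatment,new_page,0
"""
COUNTRIES = """user_id,country
1,A
2,B
3,C
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def add_country_features(ab_rows, country_rows, baseline):
    """Join on user_id and add one 0/1 column per non-baseline country."""
    country_by_user = {r["user_id"]: r["country"] for r in country_rows}
    # The baseline country gets no column, so the indicators stay independent
    # of the intercept.
    levels = sorted(set(country_by_user.values()) - {baseline})
    design = []
    for row in ab_rows:
        country = country_by_user.get(row["user_id"])
        if country is None:
            continue  # keep only users present in both tables (inner join)
        features = {
            "intercept": 1,
            "ab_page": int(row["landing_page"] == "new_page"),
            "converted": int(row["converted"]),
        }
        for level in levels:
            features["country_" + level] = int(country == level)
        design.append(features)
    return design


design = add_country_features(load_rows(AB_DATA), load_rows(COUNTRIES), baseline="A")
for row in design:
    print(row)
```

Dropping one country as the baseline means each country coefficient would be read as a difference from that baseline rather than as an absolute effect.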

Here is a summary of the model after considering the location information:

Conclusion

Earlier, we chose as our alternative hypothesis that the new page drives more conversion. Our evaluation of the p-value then suggested that we cannot reject the null hypothesis.

We then used a built-in z-test to check statistical significance. The p-value from the z-test also suggested that there is not enough evidence to reject the null hypothesis.
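For readers who want to see what such a test computes, here is a minimal hand-written sketch of a one-sided two-proportion z-test, with the alternative that the new page converts more. It is not the built-in function used in the notebook; the function name two_proportion_ztest and the counts are made up for illustration.

```python
import math


def two_proportion_ztest(conv_new, n_new, conv_old, n_old):
    """One-sided z-test of H1: p_new > p_old, using a pooled standard error."""
    p_new = conv_new / n_new
    p_old = conv_old / n_old
    # Under the null hypothesis both pages share one conversion rate.
    p_pool = (conv_new + conv_old) / (n_new + n_old)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_new + 1 / n_old))
    z = (p_new - p_old) / std_err
    # Upper-tail probability P(Z >= z) for a standard normal variable.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Made-up counts: the new page converts slightly less often than the old one.
z, p_value = two_proportion_ztest(conv_new=118, n_new=1000, conv_old=120, n_old=1000)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.3f}")
```

With these made-up counts the statistic is negative, so the one-sided p-value is above 0.5 and the null hypothesis would not be rejected at the 5% level.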

The regression models showed the same result: we cannot reject the null hypothesis. The p-values from the regression models were also not below the 5% Type I error rate. If the test were run for longer, or if we had a larger dataset, the results could be different. As of now, however, we have no evidence that the new page increases the conversion rate. Our dataset also had some misaligned rows, which further reduced the data available for analysis.

Files

analyse_ab_test_results_notebook.ipynb

The notebook where I performed the main work of this project, including data wrangling and hypothesis testing.

ab_data.csv

Input dataset with five features:

user_id

The unique identifier for each user in this dataset.

timestamp

group

Takes one of two values: control or treatment. Control is the group of existing users who have been visiting the site for a long time; they generally see the old landing page. Treatment is the group of new users who see the new landing page.

landing_page

Also takes one of two values: old_page or new_page. new_page is the page that we think may improve conversion on the website.

converted

It can be 0 or 1. 1 means the user converted and 0 means the user did not convert.
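The group and landing_page columns are expected to agree with each other, so a quick consistency check is a natural first wrangling step. The sketch below assumes that a row is "aligned" when control is paired with old_page and treatment with new_page; that reading, the helper name split_aligned, and the sample rows are assumptions for illustration, not the notebook's actual code.

```python
# Assumed pairing of group and landing page for an aligned row.
EXPECTED_PAGE = {"control": "old_page", "treatment": "new_page"}


def split_aligned(rows):
    """Separate rows whose landing_page matches their group from the rest."""
    aligned, misaligned = [], []
    for row in rows:
        if EXPECTED_PAGE.get(row["group"]) == row["landing_page"]:
            aligned.append(row)
        else:
            misaligned.append(row)
    return aligned, misaligned


# Made-up rows in the shape of ab_data.csv (timestamp omitted for brevity).
rows = [
    {"user_id": 1, "group": "control", "landing_page": "old_page", "converted": 0},
    {"user_id": 2, "group": "treatment", "landing_page": "new_page", "converted": 1},
    {"user_id": 3, "group": "treatment", "landing_page": "old_page", "converted": 0},
    {"user_id": 4, "group": "control", "landing_page": "new_page", "converted": 1},
]

aligned, misaligned = split_aligned(rows)
print(len(aligned), "aligned rows;", len(misaligned), "misaligned rows")
```

Whether misaligned rows should be dropped or investigated further is a judgment call; this sketch only separates them.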

countries.csv

Dataset that adds the country column as another independent variable.

report.html

HTML export of the Jupyter Notebook.

Event	Probability
Conversion of an individual regardless of the page they receive	0.1196
Conversion of an individual given they were in the control group	0.1024
Conversion of an individual given they were in the treatment group	0.1188
Individual received the new page	0.5
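Each quantity in the table above is a plain or conditional proportion, so it can be computed by averaging a 0/1 column over the relevant subset of rows. The sketch below shows this on a handful of made-up rows; the helper names mean and summarize and the data are hypothetical, and the numbers it prints are unrelated to the table.

```python
def mean(values):
    """Average of an iterable of numbers (booleans count as 0 or 1)."""
    values = list(values)
    return sum(values) / len(values)


def summarize(rows):
    """Overall and per-group conversion rates, plus the share of new-page rows."""
    return {
        "p_converted": mean(r["converted"] for r in rows),
        "p_converted_given_control": mean(
            r["converted"] for r in rows if r["group"] == "control"
        ),
        "p_converted_given_treatment": mean(
            r["converted"] for r in rows if r["group"] == "treatment"
        ),
        "p_new_page": mean(r["landing_page"] == "new_page" for r in rows),
    }


# Made-up rows: 4 control users (1 converted) and 4 treatment users (2 converted).
rows = (
    [{"group": "control", "landing_page": "old_page", "converted": c} for c in (1, 0, 0, 0)]
    + [{"group": "treatment", "landing_page": "new_page", "converted": c} for c in (1, 1, 0, 0)]
)

summary = summarize(rows)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Conditioning on a group is just filtering the rows before averaging, which is why the two conditional rates can differ from the overall rate.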

sanjeevai/hypothesis_testing
