qphm/memento-mori

ETL: Global Suicide Analysis

Credit: Kristina Tripkovic

Introduction

According to the World Health Organization:

"Close to 800,000 people die due to suicide every year, which is one person every 40 seconds. Suicide is a global phenomenon and occurs throughout the lifespan. Effective and evidence-based interventions can be implemented at population, sub-population and individual levels to prevent suicide and suicide attempts. There are indications that for each adult who died by suicide there may have been more than 20 others attempting suicide."

Prerequisites

Libraries you will need

  • pandas
  • numpy

Pipeline

  1. Download data from "Data Sources"

  2. Transformation

    • Jupyter Notebook

      1. Import Original CSVs
      2. Filter columns ("Country", "Freedom Rank", "Freedom Score", "Suicide Rate per 100k", "Happiness Rank", "Happiness Score", "Fifa Score", "Fifa Total Points")
      3. Move rows with null values into a separate DataFrame
      4. Reconcile country names that are spelled differently across data sets and combine the resulting duplicate rows
      5. Concatenate the corrected rows with the complete rows from the original data set, so that the result contains only full rows and excludes countries with incomplete information
      6. Export to CSV
    • Postgres

      1. Import Original CSVs
      2. Use a FULL OUTER JOIN to identify inconsistencies in country names
      3. Correct the identified country names
      4. Store the cleaned country data in output/CleanFifa.csv, output/CleanSuicide.csv, output/CleanHappiness.csv, output/CleanFreedom.csv
  3. Load

    • Schema - sql/schema.sql
    • Data - sql/queries.sql
      • FIFA data
      • Freedom data
      • Happiness data
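
The notebook transformation steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the code from the repository: the two small DataFrames stand in for the real CSVs (which would be read with `pd.read_csv`), their values are made up, and the `NAME_FIXES` mapping and the `combine` helper are hypothetical names introduced only for this example. The outer merge plays the same role as the FULL OUTER JOIN in the Postgres variant: countries whose names do not match end up in rows with null values.

```python
import pandas as pd

# Made-up stand-ins for two of the source CSVs (values are placeholders).
suicide = pd.DataFrame({
    "Country": ["Japan", "Russian Federation", "Chile", "Kiribati"],
    "Suicide Rate per 100k": [1.0, 3.0, 2.0, 4.0],
})
happiness = pd.DataFrame({
    "Country": ["Japan", "Russia", "Chile"],
    "Happiness Rank": [1, 2, 3],
    "Happiness Score": [7.0, 6.0, 5.0],
})

# Hypothetical mapping from a variant spelling to one canonical name.
NAME_FIXES = {"Russian Federation": "Russia"}


def combine(suicide, happiness, name_fixes):
    # An outer merge keeps every country from both frames; a country that
    # appears under different spellings yields rows with NaN columns.
    merged = suicide.merge(happiness, on="Country", how="outer")

    # Step 3: move rows with any null value into their own DataFrame.
    has_null = merged.isna().any(axis=1)
    complete = merged[~has_null]
    incomplete = merged[has_null]

    # Step 4: normalise spellings, then collapse the duplicate rows.
    # GroupBy.first() takes the first non-null value in each column.
    fixed = (
        incomplete.assign(Country=incomplete["Country"].replace(name_fixes))
        .groupby("Country", as_index=False)
        .first()
    )

    # Step 5: add the repaired rows back and drop countries that are
    # still missing information.
    result = pd.concat([complete, fixed], ignore_index=True).dropna()
    return result.sort_values("Country").reset_index(drop=True)


result = combine(suicide, happiness, NAME_FIXES)
print(result)
```

In this toy input, "Russian Federation" and "Russia" are merged into one complete row, while "Kiribati" is dropped because it has no happiness data. Step 6 would then be a call such as `result.to_csv(path, index=False)` with a path of your choosing.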

Data Sources

Human Freedom Index (.csv)

FIFA World Rankings (.csv)

World Happiness Report (.csv)

World Health Organization (.csv)

About

An ETL pipeline with PostgreSQL and Python
