
• A dataset consisting of 150000+ reviews from products on Amazon was used to discover insights into consumer reviews and perform sentiment analysis.


Vishihereforcodes/Amazon-Reviews-Analysis


Sentiment-analysis-on-amazon-reviews

This project is a comparative study of machine learning models for sentiment analysis of customer reviews of Amazon products in the Electronics category. The models compared are Support Vector Machines, a Naive Bayes classifier, a Random Forest classifier, BERT, a linear SVM trained with Stochastic Gradient Descent, and Linear SVC.


Dataset Source

Amazon Product Reviews Dataset (2018 Updated Version)
http://jmcauley.ucsd.edu/data/amazon/index_2014.html

Running the project

The original dataset is very large and takes a lot of computational power to process. Smaller samples of the original dataset can be used to run the project instead. Download or save a sample to your drive, and change the URLs in the notebooks accordingly.
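If you only have the full dataset, one way to get a manageable sample is to read just the first few thousand records. The sketch below is not part of the repository. It assumes the file is in JSON-lines format (one review object per line, optionally gzipped); the function name `read_sample` and the file name in the usage note are hypothetical.

```python
import gzip
import itertools
import json


def read_sample(path, n_rows=1000):
    """Return up to n_rows records from the start of a JSON-lines file.

    Assumption: one JSON object per line. Files ending in .gz are
    opened with gzip, anything else as plain text.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        # islice stops after n_rows lines, so the rest of the file is never read.
        first_lines = itertools.islice(handle, n_rows)
        # Blank lines are skipped (they still count toward n_rows).
        return [json.loads(line) for line in first_lines if line.strip()]
```

For example, `rows = read_sample("Electronics_sample.json.gz", n_rows=5000)` would return a list of dictionaries that can be passed to `pandas.DataFrame(rows)` before running the cleaning notebook.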

Update the URLs in the files and run the following steps:

  • Data cleaning: Run datacleaning.ipynb with its dataset URL pointing to the location where your dataset is stored. Also set "CLEAN_URL" to the output file where the cleaned dataframe will be written.
  • Exploratory data analysis: Run EDA.ipynb with the URL pointing to the cleaned dataset from the previous step to view the visualisations.
  • Model building and training: Inside the models/ folder, run models.ipynb to train the models (Support Vector Machines, Naive Bayes Classifier, Random Forest Classifier, Stochastic Gradient Descent (SVM Linear), etc.) and compare their accuracies. To run the BERT model, run BERT.ipynb with the URL pointing to the cleaned dataset.
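
To give a rough idea of what the model comparison step involves, here is a minimal sketch using scikit-learn. It is not the code from models.ipynb. The TF-IDF features, the hyperparameters, the single train/test split and the function name `compare_models` are all assumptions made for illustration, and BERT is left out.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def compare_models(texts, labels, test_size=0.25, seed=0):
    """Train each candidate on the same split and return {name: test accuracy}."""
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=test_size, random_state=seed, stratify=labels
    )
    # Assumed model settings; the notebook may use different ones.
    candidates = {
        "naive_bayes": MultinomialNB(),
        "linear_svc": LinearSVC(),
        # hinge loss makes SGDClassifier a linear SVM trained with SGD
        "sgd_linear_svm": SGDClassifier(loss="hinge", random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }
    scores = {}
    for name, classifier in candidates.items():
        # Each model gets its own TF-IDF vectorizer, fitted on the training split only.
        model = make_pipeline(TfidfVectorizer(), classifier)
        model.fit(x_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(x_test))
    return scores
```

Calling `compare_models(review_texts, sentiment_labels)` with a list of cleaned review strings and a matching list of sentiment labels returns a dictionary of accuracies, one per model, which makes the side-by-side comparison straightforward.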
