Project Documentation

Jalaz Kumar edited this page Aug 6, 2020 · 10 revisions

Directory Structure:

jalaz@jalaz-personal:~/Documents/Github$ tree KitabGhar-Recommend-API/
KitabGhar-Recommend-API/
├── api
│   ├── app.py
│   ├── dataprovider.py
│   ├── dbconnections.py
│   ├── recommender.py
│   └── test_api.py
├── development.env
├── KitabGhar.sql
├── README.md
├── Recommender-NB.ipynb
├── requirements.txt
├── Run-API.md
├── run-tests.sh
└── start.sh

1 directory, 13 files

Project Files

  • development.env : ENV configuration file for loading dev properties. It is not part of the git repo and must be added during project build.

  • KitabGhar.sql : Database dump file for loading the schema & raw data needed to run basic recommendations

  • Recommender-NB.ipynb : IPython notebook file for R&D purposes.

  • requirements.txt : File used for specifying what python packages are required to run the API

  • Run-API.md : File which contains instructions to build & run the API on local machine

  • run-tests.sh : Bash file for running unit tests

  • start.sh : Bash file for starting flask server & running the API

API Files

  • dataprovider.py :
    Comprises 2 classes that provide separate functionality.

    • HashingFunctionality
      • getHashUsers : Accesses users table from DB & provides dictionary of users.
      • getHashBooks : Accesses books table from DB & provides dictionary of books.
    • DynamicRecommendations
      • generateHashes : Calls self.HashingFunctionality to generate dictionary of users & books.
      • getDynamicRating : Accesses reviews table from DB & returns a python list ratingData.
      • getDataMatrix : Uses ratingData & numpy to create & return dataMatrix.
      • getSimilarityMatrix : Uses dataMatrix, numpy & scipy to create & return itemSimilarityMatrix.
  • dbconnections.py : Loads DB configurations from ENV files & returns a DB connection object.

  • recommender.py :
    Comprises 3 classes, one for each of the 3 supported recommendation types.

    • PopularityRecommenderSales
      • gatherSalesData : Accesses shelfs table from DB & counts the sales of each book. Returns a python list purchaseBooksCounter.
      • generateRecommendations : Uses purchaseBooksCounter to generate top 5 recommendations & stores them in finalRecommendations.
      • getRecommendations : Accessor function exposed to the outer world for finalRecommendations.
    • PopularityRecommenderRatings
      • gatherRatingsData : Accesses reviews table from DB & generates normalisedRatings for each book. Returns a python list ratingCalculatorBooks.
      • generateRecommendations : Uses ratingCalculatorBooks to generate top 5 recommendations & stores them in finalRecommendations.
      • getRecommendations : Accessor function exposed to the outer world for finalRecommendations.
    • ItemSimilarityRecommender
      • generateHashes : Calls dataprovider.HashingFunctionality to generate dictionary of users & books.
      • gatherPersonalisedData : Accesses shelfs table from DB, retrieves the user's personal readings & stores them in alreadyBooksRead & alreadyRecommended.
      • generateMatrices : Calls dataprovider.DynamicRecommendations to compute & extract the matrices & stores item-item cosine similarity in itemSimilarityMatrix.
      • generateRecommendations : Uses alreadyBooksRead, alreadyRecommended & itemSimilarityMatrix to generate top 18 recommendations & stores them in finalRecommendations.
      • getRecommendations : Accessor function exposed to the outer world for finalRecommendations.
  • app.py : Main file which runs the flask server & serves all routes to the outside world.
    Uses 3 separate objects instantiated from recommender.py & serves all 3 types of recommendations:

    • PopularityRecommendationsSales
    • PopularityRecommendationsRatings
    • ItemSimilarityREcommendations
  • test_api.py : File containing basic unit tests for the recommendation endpoints:

    • test_PopularityRecommendationsSales
    • test_PopularityRecommendationsRatings
    • test_ItemSimilarityREcommendations
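
The sales-based flow described for PopularityRecommenderSales (count sales per book, then keep the top 5) can be sketched roughly as below. This is a minimal, dependency-free illustration and not the project's code: the function name top_n_by_sales, the shelf_rows input and its (user, book) shape are assumptions standing in for the rows read from the shelfs table, and the sample rows are made up.

```python
from collections import Counter


def top_n_by_sales(shelf_rows, n=5):
    """Return the n most purchased book IDs, most popular first.

    shelf_rows: iterable of (user_id, book_id) pairs, a hypothetical
    stand-in for the rows read from the shelfs table.
    """
    purchase_counter = Counter(book_id for _user_id, book_id in shelf_rows)
    # most_common sorts by count in descending order.
    return [book_id for book_id, _count in purchase_counter.most_common(n)]


# Made-up sample rows, for illustration only.
sample_rows = [
    ("jalaz.kumar", "PROG1001"),
    ("sukhbir.singh", "NOV1002"),
    ("saurabh.kumar", "PROG1001"),
    ("narendra.dodwaria", "ACAD1001"),
    ("sukhbir.singh", "PROG1001"),
    ("jalaz.kumar", "NOV1002"),
]
print(top_n_by_sales(sample_rows, n=2))
```

A ratings-based variant would follow the same shape, with the per-book count replaced by whatever normalised rating score the recommender computes.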

Collaborative-Filtering implementation for Item-Item Similarity

Ratings Data extracted for entire platform:

[(jalaz.kumar,PROG1001,3),(sukhbir.singh,NOV1002,5),(jalaz.kumar,NOV1001,1),(narendra.dodwaria,ACAD1001,5),(saurabh.kumar,NOV1003,2)]

Data-Matrix created:

USER-ID / Book-ID    PROG1001  ACAD1001  NOV1001  NOV1002  NOV1003
jalaz.kumar          3         -         1        -        -
sukhbir.singh        -         -         -        5        -
narendra.dodwaria    -         5         -        -        -
saurabh.kumar        -         -         -        -        2

(- marks a book the user has not rated)
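
As a rough sketch of how the ratings list turns into the data matrix above: each user and book ID is hashed to a row or column position, and each rating is written into the matching cell. The project uses numpy for this step; the version below uses plain Python lists to stay dependency-free. The helper names build_index and build_data_matrix are hypothetical, and filling unrated cells with 0 is an assumption.

```python
# Ratings as (user_id, book_id, rating), copied from the example above.
rating_data = [
    ("jalaz.kumar", "PROG1001", 3),
    ("sukhbir.singh", "NOV1002", 5),
    ("jalaz.kumar", "NOV1001", 1),
    ("narendra.dodwaria", "ACAD1001", 5),
    ("saurabh.kumar", "NOV1003", 2),
]


def build_index(ids):
    """Map each distinct ID to a position, in first-seen order."""
    index = {}
    for item in ids:
        index.setdefault(item, len(index))
    return index


def build_data_matrix(ratings):
    """Return (user index, book index, user x book matrix)."""
    users = build_index(u for u, _b, _r in ratings)
    books = build_index(b for _u, b, _r in ratings)
    # Assumption: a cell stays 0 when the user has not rated the book.
    matrix = [[0] * len(books) for _ in users]
    for user_id, book_id, rating in ratings:
        matrix[users[user_id]][books[book_id]] = rating
    return users, books, matrix


users, books, matrix = build_data_matrix(rating_data)
print(matrix[users["jalaz.kumar"]][books["NOV1001"]])
```

Because positions are assigned in first-seen order, the column order here follows the ratings list rather than the order of the table header.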

Item to Item Similarity-Matrix created:

Book-ID/Book-ID PROG1001 ACAD1001 NOV1001 NOV1002 NOV1003
PROG1001
ACAD1001
NOV1001
NOV1002
NOV1003
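
The item-to-item step can be illustrated with cosine similarity between the book columns of the data matrix. The project computes this with numpy and scipy; the sketch below swaps in a plain-Python cosine so it runs without dependencies. The function names are hypothetical, and treating unrated cells as 0 is an assumption, so the values it produces are not necessarily those the API returns.

```python
import math

# Data matrix from the example above: rows are users, columns are books.
# Assumption: cells with no rating are treated as 0.
books = ["PROG1001", "ACAD1001", "NOV1001", "NOV1002", "NOV1003"]
data_matrix = [
    [3, 0, 1, 0, 0],  # jalaz.kumar
    [0, 0, 0, 5, 0],  # sukhbir.singh
    [0, 5, 0, 0, 0],  # narendra.dodwaria
    [0, 0, 0, 0, 2],  # saurabh.kumar
]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def item_similarity(matrix):
    """Return the book x book cosine similarity matrix."""
    columns = list(zip(*matrix))  # one rating vector per book
    return [[cosine(ci, cj) for cj in columns] for ci in columns]


similarity = item_similarity(data_matrix)
for book, row in zip(books, similarity):
    print(book, [round(value, 2) for value in row])
```

Under these assumptions, PROG1001 and NOV1001 are the only pair with a non-zero off-diagonal score, since they are the only two books rated by the same user.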