This repo contains deep models used to predict sales for each store and item.

Store-Item-Demand-Forecasting

Mission statement:

A data science project for demand analysis of items in stores. The dataset is a collection of multiple time series: 500 store-item combinations, which we analyse in order to forecast their future sales.

Dataset: Store Item Demand Forecasting Challenge (Kaggle)

Procedure

We use train.csv, which contains 5 years of daily sales (2013-2017) for 50 items in 10 stores, i.e. 500 store-item sales series of 5 years each. Sales values range from about 2 to 50.

Data Visualization

Store-wise and Item-wise sales arranged according to maximum sales

From the figures, we can see that each store and each item has trend and seasonality components.

Day-wise, month-wise and year-wise sales

From the figures, we can see that sales increase year over year.

Item-wise and Store-wise sales

Feature selection:

Categorical Embedding

The task of entity embedding is to map discrete values to a multi-dimensional space in which values with a similar effect on the output are close to each other. Once every categorical variable is represented by an entity embedding, all the embedding layers and the inputs of any continuous variables are concatenated. The merged layer is treated like a normal input layer in a neural network, and further layers can be built on top of it. With entity embeddings we want similar values of a categorical variable to lie close together in the embedding space. We added embedding layers for the year, week of the year, day of the week and month, extracted from the date feature, as well as for stores and items. Concatenating all the embedding layers then yielded 62 unique features after eliminating redundancy.
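The lookup-and-concatenate step can be sketched in a few lines of NumPy. This is only an illustration with made-up embedding dimensions (4, 8 and 3 here are arbitrary), not the repo's actual Keras layers, where the tables are learned during training:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical embedding tables; in the real model these weights are learned.
store_emb = rng.normal(size=(10, 4))   # 10 stores  -> 4-dim vectors
item_emb  = rng.normal(size=(50, 8))   # 50 items   -> 8-dim vectors
dow_emb   = rng.normal(size=(7, 3))    # day of week -> 3-dim vectors

def features(store_id, item_id, dow):
    # Look up each categorical value and concatenate the vectors,
    # as the merged embedding layer does before the dense layers.
    return np.concatenate([store_emb[store_id], item_emb[item_id], dow_emb[dow]])

vec = features(3, 17, 5)   # one (store, item, day-of-week) sample -> 15 features
```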

Train, Test and Validation sets:

We considered 2017 as our test data and 2013-2016 as our train data. The train data has 730,500 samples, and the test data has 182,500 samples. For validation, we used a leave-six-months-out strategy: 6 months serve as validation data and the remaining 42 months are used for training. We fine-tuned the results by choosing a different 6-month window each time, giving 8 validation sets in total across 2013-2016.

Deep models

In this category, after concatenating all the embedding layers, we applied Neural Networks, Long Short-Term Memory (LSTM), a Temporal Convolutional Network (TCN), a hybrid model (TCN + LSTM), and an LSTM Autoencoder.
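The TCN's core building block is the dilated causal convolution: each output depends only on current and past inputs, and the dilation spaces out the taps to widen the receptive field. A minimal NumPy illustration of that single operation (not the keras-tcn implementation, which stacks many such layers with residual connections):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation):
    """y[t] = sum_k w[k] * x[t - k*dilation], with implicit zero padding
    on the left so that no future values leak into y[t]."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for t in range(len(x)):
        for k, wk in enumerate(w):
            idx = t - k * dilation
            if idx >= 0:           # taps before the series start contribute 0
                y[t] += wk * x[idx]
    return y
```

With dilation 1 this is an ordinary causal convolution; larger dilations let deeper layers see exponentially further into the past with the same kernel size.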

Other techniques

Fbprophet

Predictions on Deep models

Predictions of Neural Networks evaluated by R2 score

Predictions of LSTM evaluated by R2 score

Similarly, we have plotted graphs for the TCN, the hybrid model and the LSTM Autoencoder.

Predictions on Other techniques

Predictions of Fbprophet evaluated by R2 score, MAPE, MAE
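For reference, the three evaluation metrics can be sketched in a few lines of NumPy. These are equivalent in spirit to the scikit-learn versions; edge cases such as zero targets in MAPE are ignored here:

```python
import numpy as np

def r2_score(y_true, y_pred):
    # Coefficient of determination: 1 - residual SS / total SS.
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def mape(y_true, y_pred):
    # Mean absolute percentage error (assumes no zeros in y_true).
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100)

def mae(y_true, y_pred):
    # Mean absolute error.
    return float(np.mean(np.abs(np.asarray(y_true, float) - np.asarray(y_pred, float))))
```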

Weights files

Neural Networks

LSTM

TCN

LSTM Autoencoder

TCN + LSTM

References

[1] Entity Embeddings of Categorical Variables

[2] https://github.com/philipperemy/keras-tcn
