Several novel attention-based multi-modal encoder-decoder models are proposed and empirically evaluated to forecast the sales of a new product based purely on product images, any available product attributes, and external factors such as holidays, events, weather, and discounts.
Trend-driven retail industries such as fashion launch a substantial number of new products every season. In such a scenario, an accurate demand forecast for these newly launched products is vital for efficient downstream supply chain planning tasks such as assortment planning and stock allocation. While classical time-series forecasting algorithms can be used to forecast the sales of existing products, new products have no historical time-series data on which to base a forecast. In this paper, we propose and empirically evaluate several novel attention-based multi-modal encoder-decoder models to forecast the sales of a new product based purely on product images, any available product attributes, and external factors such as holidays, events, weather, and discounts. We experimentally validate our approaches on a large fashion dataset and report improvements in accuracy and model interpretability over existing k-nearest-neighbor-based baseline approaches.
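To make the k-nearest-neighbor baseline idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes each existing product is described by a numeric feature vector and a weekly sales curve, and it forecasts a new product by averaging the curves of its nearest neighbors. The function name `knn_forecast`, the Euclidean distance, and the toy catalog are all illustrative assumptions; the abstract does not specify how the baselines are built.

```python
import math

def knn_forecast(new_features, catalog, k=2):
    """Average the sales curves of the k catalog products closest to new_features.

    catalog: list of (feature_vector, sales_curve) pairs for existing products.
    All names and the choice of Euclidean distance are hypothetical.
    """
    # Rank existing products by Euclidean distance in feature space.
    ranked = sorted(catalog, key=lambda item: math.dist(new_features, item[0]))
    neighbors = [curve for _, curve in ranked[:k]]
    # Average the neighbors' sales period by period.
    return [sum(period) / len(neighbors) for period in zip(*neighbors)]

# Toy catalog: (attribute embedding, sales per week). Values are made up.
catalog = [
    ([0.0, 0.0], [10.0, 20.0, 30.0]),
    ([1.0, 0.0], [20.0, 30.0, 40.0]),
    ([5.0, 5.0], [100.0, 100.0, 100.0]),
]

forecast = knn_forecast([0.2, 0.0], catalog, k=2)
print(forecast)
```

A baseline of this kind weights every neighbor and every feature equally, which is one reason a learned, attention-based model could plausibly offer better accuracy and interpretability.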
Surya Sajja
Satyam Dwivedi
V. Raykar