The accuracy of this kind of recommender tends to be low. Although we recommended ten coupons to every user, many users likely purchased fewer than ten coupons during the test period, which puts a hard cap on the achievable accuracy. In addition, the pool of candidate coupons is very large, and many coupons look nearly identical across the attributes available in our data, making them difficult to distinguish.
Figure 1. Coupon popularity accuracy on various classifiers
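To make the accuracy numbers concrete, the sketch below shows one plausible way to score a list of ten recommendations per user (precision at 10). This is our own illustration rather than necessarily the exact metric used in the competition, and the names are hypothetical, not the project's actual data structures.

# Hedged sketch: one plausible top-10 accuracy metric (precision@10).
# `recommended` and `purchased` are illustrative stand-ins.

def precision_at_10(recommended, purchased):
    """Fraction of the ten recommended coupons the user actually bought."""
    hits = len(set(recommended[:10]) & set(purchased))
    return hits / 10.0

# Even a perfect recommender is capped when a user buys fewer than ten
# coupons in the test period:
recommended = [f"coupon_{i}" for i in range(10)]
purchased = ["coupon_0", "coupon_4"]  # this user bought only two coupons
print(precision_at_10(recommended, purchased))  # at most 0.2 for this user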
After building our model and recommending coupons, we found that Cosine Similarity achieved a much higher accuracy than our baseline, which was encouraging. It was also helpful to see the accuracy improve each time we incorporated additional data or methods into our approach.
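As a rough illustration of the approach, here is a minimal sketch of cosine-similarity scoring, assuming user profiles and coupons have already been encoded as numeric vectors over shared attributes (genre, price range, area, and so on). All names here are our own illustrations, not the project's actual code.

import numpy as np

def recommend_top10(user_vec, coupon_matrix, coupon_ids):
    # Cosine similarity between the user profile and every coupon row.
    norms = np.linalg.norm(coupon_matrix, axis=1) * np.linalg.norm(user_vec)
    scores = coupon_matrix @ user_vec / np.where(norms == 0.0, 1.0, norms)
    # Rank coupons by similarity and keep the ten best.
    top = np.argsort(scores)[::-1][:10]
    return [coupon_ids[i] for i in top]

# Toy example with three coupons described by two attributes:
coupons = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
user = np.array([1.0, 0.2])
print(recommend_top10(user, coupons, ["a", "b", "c"]))

Ranking every coupon by similarity to a user profile, rather than training a per-user classifier, is one way to keep the method cheap enough to run over a very large coupon pool.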
If we continued this project, we would like to try some feature engineering on the available data. We would also like to try different algorithms, such as a multi-layer neural network or a gradient-boosted decision tree. Unfortunately, because the dataset is so large, adding many more attributes would have made the algorithms take much longer to run; most of the top scorers on Kaggle reported run times of between 24 hours and several days on very high-end servers. This also left us with less flexibility to try out different approaches. Looking back at all the challenges we faced, we now understand why this was a competition on Kaggle. It was an interesting challenge, and one from which we gained a great deal of knowledge.