Recommender system which favors newer things
In this article I continue my work on a recommender system. Time is now included in the system, which it wasn’t in my last article, “Developing a recommender system”. Items that were bought more recently are now recommended more frequently. Similarity between items is still part of the system, so including time is a sort of upgrade.
The structure of the recommending function is slightly different than the one in my last article about recommender systems. So, let’s start explaining the procedure.
Let’s say that there is a list of items that a buyer bought, with the time of purchase recorded for every item.
For every buyer, we store this buying sequence record.
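As a rough illustration of such a record, here is a minimal Python sketch. The buyer names, items, and the choice to store time as "days since an arbitrary start date" are all assumptions made for the example, not part of the original system.

```python
# Hypothetical purchase records: buyer -> list of (item, time of purchase).
# Time is assumed to be a non-negative number of days since some start date.
purchases = {
    "alice": [("lamp", 60.0), ("book", 4.0)],
    "bob": [("book", 6.0), ("desk", 100.0)],
}


def buying_sequence(records, buyer):
    """Return the buyer's purchases ordered from oldest to newest."""
    return sorted(records[buyer], key=lambda record: record[1])


alice_sequence = buying_sequence(purchases, "alice")
```

Keeping each sequence sorted by time makes "the most recently bought item" simply the last element.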
The new similarity measure we need is a similarity between sequences that takes time into account. With it, the recommending procedure goes like this. First, you select the buyer you want to recommend a product to. You take his sequence of purchased items and find the sequence most similar to it in the timed sense. From that other sequence, take the most recently bought product that the buyer you are recommending to hasn’t already bought.
The general formula for timed sequence similarity goes like this:
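The procedure can be sketched in Python roughly as follows. The function name `recommend`, the data layout, and the toy `shared_items` similarity are assumptions for illustration. The real timed sequence similarity would be passed in as `sequence_similarity`.

```python
def recommend(target, sequences, sequence_similarity):
    """Pick the newest item from the most similar sequence that target lacks.

    sequences: dict mapping buyer -> list of (item, purchase_time).
    sequence_similarity: callable(seq_a, seq_b) -> float (placeholder here).
    """
    target_seq = sequences[target]
    owned = {item for item, _ in target_seq}
    others = [buyer for buyer in sequences if buyer != target]
    if not others:
        return None
    # Step 1: find the buyer whose sequence is most similar to the target's.
    best = max(others, key=lambda b: sequence_similarity(target_seq, sequences[b]))
    # Step 2: keep only items the target buyer has not bought yet.
    candidates = [(item, t) for item, t in sequences[best] if item not in owned]
    if not candidates:
        return None
    # Step 3: return the most recently bought of those items.
    return max(candidates, key=lambda pair: pair[1])[0]


def shared_items(seq_a, seq_b):
    """Toy stand-in similarity: number of items the two sequences share."""
    return len({i for i, _ in seq_a} & {i for i, _ in seq_b})


example = {
    "a": [("book", 1.0), ("lamp", 5.0)],
    "b": [("book", 2.0), ("pen", 4.0), ("desk", 9.0)],
    "c": [("sofa", 3.0)],
}
suggestion = recommend("a", example, shared_items)
```

Passing the similarity in as a parameter keeps the procedure independent of how the timed similarity is eventually defined.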
timed sequence similarity = timed similarity between items in sequence / sequence disparity
In the timed similarity between items in sequence I want to include time. So the rule I am going to make is this: the similarity of two items increases the closer together in time they were bought. Also, I want to favor items that were bought more recently. To do this, I will increase the timed similarity between two sequences when the items from those sequences were bought more recently. In a way this is cheating in terms of similarity, but it just might do the trick.
So the formula for timed similarity between items in sequence is this:
(1) For every combination of two items taken from different sequences, sum this: similarity between the two items * similarity between the times these items were bought * arctan(a * (time of purchase of item1 + time of purchase of item2))
The factor a is a weight and should be tuned experimentally.
Each of the three factors in (1) is bounded to a fixed interval [0, x] (for example, arctan of a non-negative argument stays below π/2), so they do not need to be normalized. What does need normalizing is the number of item combinations, since it can differ for every pair of sequences. So, divide expression (1) by the number of item combinations.
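A minimal Python sketch of expression (1) with this normalization might look like the following. The `time_similarity` function and its `scale` parameter are assumptions, since the article does not define how similarity between purchase times is computed. The item similarity is passed in as a placeholder, and purchase times are assumed to be non-negative numbers.

```python
import math
from itertools import product


def time_similarity(t1, t2, scale=30.0):
    """Assumed form: 1.0 when the times coincide, decaying toward 0 as they diverge."""
    return 1.0 / (1.0 + abs(t1 - t2) / scale)


def timed_item_similarity(seq_a, seq_b, item_similarity, a=0.01):
    """Expression (1), divided by the number of cross-sequence item pairs."""
    pairs = list(product(seq_a, seq_b))
    if not pairs:
        return 0.0
    total = 0.0
    for (item1, t1), (item2, t2) in pairs:
        # Recency factor: grows with later purchase times, bounded by pi/2.
        recency = math.atan(a * (t1 + t2))
        total += item_similarity(item1, item2) * time_similarity(t1, t2) * recency
    return total / len(pairs)


def exact_match(item1, item2):
    """Toy item similarity for the example: 1.0 for identical items, else 0.0."""
    return 1.0 if item1 == item2 else 0.0


older = timed_item_similarity([("book", 10.0)], [("book", 10.0)], exact_match)
newer = timed_item_similarity([("book", 100.0)], [("book", 100.0)], exact_match)
```

With identical items bought at identical times, only the recency factor differs, so `newer` comes out larger than `older`.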
Furthermore, an additional similarity factor can be added, although it might not be necessary. That factor is sequence disparity, which is basically the difference in length between the two sequences. So, divide the expression you have now by the ratio size of the longer sequence / size of the shorter sequence. This says that two sequences are more similar if their sizes are more similar. It might actually have a negative effect, in which case it should be excluded.
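Here is a small Python sketch of this optional step. It assumes disparity is taken as the longer length over the shorter length, chosen so that dividing by it lowers the score as the lengths diverge. The function names and the `use_disparity` switch are hypothetical.

```python
def sequence_disparity(seq_a, seq_b):
    """Assumed disparity: longer length / shorter length (1.0 for equal lengths)."""
    longer = max(len(seq_a), len(seq_b))
    shorter = min(len(seq_a), len(seq_b))
    if shorter == 0:
        # An empty sequence is treated as infinitely disparate.
        return float("inf")
    return longer / shorter


def timed_sequence_similarity(item_score, seq_a, seq_b, use_disparity=True):
    """Divide the normalized item score by the disparity, or skip that step."""
    if not use_disparity:
        return item_score
    return item_score / sequence_disparity(seq_a, seq_b)


short_seq = [("book", 1.0), ("lamp", 2.0)]
long_seq = [("book", 1.0), ("pen", 2.0), ("desk", 3.0), ("sofa", 4.0)]
penalized = timed_sequence_similarity(0.8, short_seq, long_seq)
unchanged = timed_sequence_similarity(0.8, short_seq, long_seq, use_disparity=False)
```

The switch makes it easy to compare recommendations with and without the disparity factor before deciding whether to keep it.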
To find out how the similarity between two items is implemented, you may need to look at my previous article on recommender systems, “Developing a recommender system”.
An implementation of this extended recommender system is planned.