Open Source collaborative filtering frameworks

collaborative-filtering

I was wondering whether there are any open source frameworks that would help me add the following type of functionality to my website:

1) If I am viewing a particular product, I would like to see what other products may be interesting to me. This information may be deduced by calculating for example what other people in my region (or any other characteristic of my profile) bought in addition to the product that I am viewing. Kind of like what Amazon.com does.

2) Deduce relationships between people based on their profile, interaction with one another on the website (via commenting on one another's posts for example), use of the website in terms of areas most navigated, products bought in common etc.

I am not looking for an open source website with this functionality, but something like an object model into which I can feed information about users and their use of the site, including rules about relationships, and then at a later point ask it the questions described in (1) and (2) above.

Any pointers to white papers / general information about best approaches to do this, or any related links will really help too.

Best Solution

(I am the developer of Taste, which is now part of Apache Mahout)

1) You're really asking for two things here: a) Recommend items I might like b) Favor items that are similar to the thing I am currently looking at.

Indeed, Mahout Taste is all about answering a). Everything it does supports systems like this. Take a look at the documentation to get started, and send any questions to mahout-user@apache.org.

For 1b) in particular, Mahout has two answers:

If you are only interested in what items are similar to the current item, you would be interested in the ItemSimilarity abstraction in Mahout (org.apache.mahout.cf.taste.similarity.ItemSimilarity) and its implementations, like PearsonCorrelationSimilarity. Based on a set of user-item ratings, this can tell you an estimated similarity between any two items. You'd then just pick the most similar items. In fact, look at the TopItems class in Mahout, which can compute this for you quickly.
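To make the idea concrete, here is a minimal sketch of item-to-item Pearson correlation followed by a "pick the most similar" step. It is plain Java and does not use Mahout's classes; the class name ItemSimilaritySketch, its methods, and the sample ratings are all made up for illustration, and Mahout's own implementations may handle edge cases differently.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical standalone sketch; not Mahout's ItemSimilarity or TopItems.
class ItemSimilaritySketch {

    // Pearson correlation between two items, each given as userId -> rating.
    // Only users who rated both items contribute.
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        List<double[]> pairs = new ArrayList<>();
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                pairs.add(new double[] {e.getValue(), other});
            }
        }
        int n = pairs.size();
        if (n < 2) {
            return Double.NaN; // too little overlap to estimate anything
        }
        double meanX = 0, meanY = 0;
        for (double[] p : pairs) {
            meanX += p[0];
            meanY += p[1];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (double[] p : pairs) {
            double dx = p[0] - meanX, dy = p[1] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        if (sxx == 0 || syy == 0) {
            return Double.NaN; // one item has no variance among shared users
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    // ratings maps itemId -> (userId -> rating). Returns up to howMany item IDs,
    // most similar to itemId first; items with undefined similarity are skipped.
    static List<Long> mostSimilarItems(Map<Long, Map<Long, Double>> ratings,
                                       long itemId, int howMany) {
        Map<Long, Double> target = ratings.get(itemId);
        if (target == null) {
            return new ArrayList<>();
        }
        Map<Long, Double> scores = new HashMap<>();
        for (Map.Entry<Long, Map<Long, Double>> e : ratings.entrySet()) {
            if (e.getKey() == itemId) {
                continue; // skip the item itself
            }
            double s = pearson(target, e.getValue());
            if (!Double.isNaN(s)) {
                scores.put(e.getKey(), s);
            }
        }
        List<Long> ids = new ArrayList<>(scores.keySet());
        ids.sort(Comparator.comparingDouble((Long id) -> scores.get(id)).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(howMany, ids.size())));
    }

    // Made-up ratings: four items rated by users 10, 11 and 12.
    static Map<Long, Map<Long, Double>> sampleRatings() {
        Map<Long, Map<Long, Double>> r = new HashMap<>();
        r.put(1L, Map.of(10L, 5.0, 11L, 3.0, 12L, 1.0));
        r.put(2L, Map.of(10L, 4.0, 11L, 3.0, 12L, 2.0));
        r.put(3L, Map.of(10L, 1.0, 11L, 3.0, 12L, 5.0));
        r.put(4L, Map.of(10L, 5.0, 11L, 4.0, 12L, 1.0));
        return r;
    }

    public static void main(String[] args) {
        System.out.println(mostSimilarItems(sampleRatings(), 1L, 2));
    }
}
```

In this toy data, item 2 follows the same rating pattern as item 1 and item 3 the opposite one, so item 2 ranks first and item 3 last when asking for items similar to item 1.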

But also, you can combine a) and b) by computing recommendations, then applying a Rescorer implementation which then favors items that are similar to the currently-viewed item.
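The rescoring idea can be sketched roughly as follows. This is standalone Java, not Mahout code: the ItemRescorer interface below is a local stand-in defined for this example, and the boost formula and all numbers are assumptions chosen only to show the shape of the approach.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical standalone sketch of "recommend, then rescore".
class RescoringSketch {

    // Local stand-in for a rescoring hook; defined here, not imported from Mahout.
    interface ItemRescorer {
        double rescore(long itemId, double originalScore);
        boolean isFiltered(long itemId);
    }

    // Builds a rescorer that boosts items similar to the one being viewed.
    // similarityToCurrent maps itemId -> similarity in [-1, 1] to the current item.
    static ItemRescorer favorSimilarTo(long currentItemId,
                                       Map<Long, Double> similarityToCurrent,
                                       double boost) {
        return new ItemRescorer() {
            @Override
            public double rescore(long itemId, double originalScore) {
                double sim = similarityToCurrent.getOrDefault(itemId, 0.0);
                // Assumed formula: only positive similarity raises the score.
                return originalScore * (1.0 + boost * Math.max(0.0, sim));
            }

            @Override
            public boolean isFiltered(long itemId) {
                // Never recommend the item the user is already looking at.
                return itemId == currentItemId;
            }
        };
    }

    // Applies the rescorer to raw recommendation scores and returns the top IDs.
    static List<Long> rerank(Map<Long, Double> scores, ItemRescorer rescorer, int howMany) {
        Map<Long, Double> adjusted = new HashMap<>();
        for (Map.Entry<Long, Double> e : scores.entrySet()) {
            long id = e.getKey();
            if (rescorer.isFiltered(id)) {
                continue;
            }
            adjusted.put(id, rescorer.rescore(id, e.getValue()));
        }
        List<Long> ids = new ArrayList<>(adjusted.keySet());
        ids.sort(Comparator.comparingDouble((Long id) -> adjusted.get(id)).reversed());
        return new ArrayList<>(ids.subList(0, Math.min(howMany, ids.size())));
    }

    public static void main(String[] args) {
        // Made-up recommendation scores for one user, and similarities to item 1.
        Map<Long, Double> scores = Map.of(1L, 4.8, 2L, 4.0, 3L, 4.5, 4L, 3.0);
        Map<Long, Double> simToItem1 = Map.of(2L, 1.0, 3L, -1.0, 4L, 0.5);
        System.out.println(rerank(scores, favorSimilarTo(1L, simToItem1, 0.5), 3));
    }
}
```

The design choice here is that the recommender's own score stays the main signal and similarity to the viewed item only nudges it, so a strongly recommended but unrelated item can still appear.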

2) Yes, likewise: you would be interested in the UserSimilarity abstraction and its implementations. These deduce similarities between users based on their item ratings. Mahout, however, does not help you derive those ratings from, say, user behavior. That part is domain-specific and up to you.
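As one illustration of the "up to you" part, the sketch below turns behavior events into implicit preference values and then compares two users. Everything in it is an assumption made for the example: the Action types, the weights 1, 2 and 5, and the use of cosine similarity (swapped in here for simplicity; it is not a Mahout class).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical standalone sketch: implicit ratings from behavior, then user similarity.
class UserSimilaritySketch {

    enum Action { VIEW, COMMENT, PURCHASE }

    // Assumed weights; in practice these would be tuned for your own site.
    static double weight(Action action) {
        switch (action) {
            case VIEW:
                return 1.0;
            case COMMENT:
                return 2.0;
            default:
                return 5.0; // PURCHASE
        }
    }

    // prefs maps userId -> (itemId -> accumulated implicit preference).
    static void record(Map<Long, Map<Long, Double>> prefs,
                       long userId, long itemId, Action action) {
        prefs.computeIfAbsent(userId, k -> new HashMap<>())
             .merge(itemId, weight(action), Double::sum);
    }

    // Cosine similarity between two users' preference vectors (itemId -> value).
    static double cosine(Map<Long, Double> a, Map<Long, Double> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // a user with no recorded activity is similar to nobody
        }
        return dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        Map<Long, Map<Long, Double>> prefs = new HashMap<>();
        record(prefs, 1L, 100L, Action.PURCHASE);
        record(prefs, 1L, 200L, Action.VIEW);
        record(prefs, 2L, 100L, Action.PURCHASE);
        record(prefs, 2L, 200L, Action.COMMENT);
        record(prefs, 3L, 300L, Action.PURCHASE);
        System.out.println(cosine(prefs.get(1L), prefs.get(2L)));
        System.out.println(cosine(prefs.get(1L), prefs.get(3L)));
    }
}
```

Once behavior has been reduced to numbers like these, the resulting user-item values are the kind of input a rating-based similarity measure can work from.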

If this sounds confusing, read the docs and feel free to follow up on mahout-user@apache.org, where I can tell you more.
