Yeah... you can make this work.
First, if your setup is relatively small, then you won't need Hadoop.
Second, tracking lots of kinds of actions is a very reasonable thing to do.
My own suggestion is that you analyze each kind of action independently for
its predictive power and then combine them at recommendation time.
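
To make that concrete, here is a rough sketch of what I mean, not a
finished design. It builds one item-to-item model per kind of action and
only mixes them when scoring for a user. Raw cooccurrence counts stand in
for a real measure of predictive power, and the action names and weights
are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence(events):
    """Count how often two items share a user, for ONE kind of action.

    events: iterable of (user, item) pairs.
    Raw counts are a simplification; a real system would likely use a
    statistical test to keep only the interesting pairs.
    """
    items_by_user = defaultdict(set)
    for user, item in events:
        items_by_user[user].add(item)

    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(history, models, weights, top_n=5):
    """Combine the per-action models at recommendation time.

    history: {action: set of items the user recently touched that way}
    models:  {action: cooccurrence counts for that action}
    weights: {action: hypothetical weight, to be tuned by testing}
    """
    seen = set().union(*history.values())
    scores = defaultdict(float)
    for action, items in history.items():
        model = models.get(action, {})
        weight = weights.get(action, 0.0)
        for item in items:
            for other, count in model.get(item, {}).items():
                if other not in seen:
                    scores[other] += weight * count
    # Highest score first; ties broken by item id for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]
```

Because each action has its own model, you can add or drop a kind of
action, or change its weight, without recomputing the others.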
As for deploying the recommendation model, I suggest a search engine with a
field for each kind of recommendation cue that you need. You can combine any
or all of these cues in a single non-textual search that uses the recent
history of the user as the query.
This search-abuse style of recommendations is pretty easy to deploy and PHP
has a reasonably good package for sending queries to Solr, which is the
search engine I tend to recommend.
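
Here is roughly what such a query could look like, sketched in Python
rather than PHP just to show the shape. The field names
(purchase_indicators, click_indicators), the boosts, and the core URL are
all assumptions for illustration. Each item document is assumed to carry,
per field, the IDs of items that predict it for that kind of action. The
sketch also assumes item IDs are plain alphanumeric strings that need no
escaping.

```python
from urllib.parse import urlencode


def build_recommendation_query(history, boosts, rows=10):
    """Build Solr /select parameters from a user's recent history.

    history: {field name: list of item IDs the user recently acted on}
    boosts:  {field name: boost}; fields not listed get 1.0
    """
    clauses = []
    for field in sorted(history):
        ids = history[field]
        if not ids:
            continue
        terms = " ".join(f'"{item_id}"' for item_id in ids)
        boost = boosts.get(field, 1.0)
        # Standard Lucene syntax: field:(term term)^boost
        clauses.append(f"{field}:({terms})^{boost}")

    # With no history at all, fall back to matching everything.
    query = " OR ".join(clauses) if clauses else "*:*"
    return {"q": query, "rows": rows, "fl": "id,score", "wt": "json"}


def select_url(base_url, params):
    """Attach the encoded parameters to a (hypothetical) Solr core URL."""
    return base_url + "?" + urlencode(params)
```

For example, select_url("http://localhost:8983/solr/items/select",
build_recommendation_query(history, boosts)) gives a URL you could fetch
from any HTTP client. The documents that come back, ranked by score, are
the recommendations.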
You should also make a provision for A/B testing on different
recommendation approaches and combinations of inputs. This is pretty
straightforward, but usually requires some sort of experimental condition
assignment and definitely requires good log recording and analysis.
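
A minimal sketch of the assignment and logging side, under my own
assumptions: users are hashed into conditions so that the same user always
sees the same variant, and every event is written as one JSON line that
names the condition. The function names and record fields are
hypothetical.

```python
import hashlib
import json
import time


def assign_condition(user_id, experiment, conditions):
    """Deterministically map a user to one experimental condition.

    Hashing the experiment name together with the user ID keeps a user's
    assignment stable within an experiment but independent across them.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return conditions[int(digest, 16) % len(conditions)]


def log_record(user_id, experiment, condition, event, item_id, now=None):
    """Format one log line; analysis later groups events by condition."""
    record = {
        "ts": time.time() if now is None else now,
        "user": user_id,
        "experiment": experiment,
        "condition": condition,
        "event": event,
        "item": item_id,
    }
    return json.dumps(record, sort_keys=True)
```

You would call assign_condition when building a page, pick the
recommendation approach for that condition, and write a log_record for
each impression, click, and purchase.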
That said, this isn't a tiny project. No single step is terribly hard and
the overall architecture is pretty straightforward, but there is a good bit
of work to be done.
On Sun, Feb 17, 2013 at 4:21 PM, Douglass Davis wrote:
> I don't have any prior experience with Hadoop. I am also not a statistics
> expert. I am a software engineer, however, after looking at the docs,
> Hadoop still seems pretty intimidating to set up.
> I am interested in doing product recommendations. However, I want to
> store many things about user behavior, for example whether they click on a
> link in an email, how they rate a product, whether they buy it, etc. Then
> I would like to come up with similar items that a user may like. I have
> seen an example just based on user ratings, but would like to add much more
> Also, I think the clustering could be used in terms of recommending based
> on similar descriptions, attributes, and keywords.
> Or, I could use a combination of the two approaches.
> Another question, I wonder if Hadoop takes into account the passage of
> time. For example, a user may rate something high, then change their
> rating a couple months later.
> Lastly, my site is based on PHP. I need to be able to integrate that with
> How feasible is this approach? I saw a clustering example, and a
> recommendation example based on user ratings. Are there any other advice,
> docs, or examples that you could point me to that deals with any of these