Your read queries seem to be driven more from the 'action' and 'object'
perspective than from the user.
1- So one option is to make a composite key from action and object:
action|object, where the columns are the users who generate events on this
combination. You can scan with a prefix filter if you want to look at the data
for a specific set of actions and objects, i.e. your requirements 1, 3 & 4. Key
distribution should be OK too. The drawbacks here are that a) you can end
up with really wide rows and b) what if you want to store more information than
just the user id in the columns?
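
To make the layout concrete, here is a rough in-memory sketch in Python. It is
not HBase client code: the WideTable class, the helper names and the sample
data are all made up for illustration, and storing the event timestamp as the
cell value is my assumption. In HBase itself the prefix_scan part would be a
Scan with a PrefixFilter.

```python
from bisect import bisect_left
from collections import defaultdict

SEP = "|"  # separator used in the composite row key action|object


def row_key(action, obj):
    """Build the composite row key, e.g. 'foo|bar'."""
    return f"{action}{SEP}{obj}"


class WideTable:
    """Hypothetical stand-in for a sorted wide-row table.

    rows maps row key -> {user_id: timestamp}; one column per user.
    """

    def __init__(self):
        self.rows = defaultdict(dict)

    def put(self, action, obj, user_id, ts):
        self.rows[row_key(action, obj)][user_id] = ts

    def prefix_scan(self, prefix):
        """Yield (key, columns) for every row whose key starts with prefix."""
        keys = sorted(self.rows)
        i = bisect_left(keys, prefix)
        while i < len(keys) and keys[i].startswith(prefix):
            yield keys[i], self.rows[keys[i]]
            i += 1


table = WideTable()
table.put("foo", "bar", "u1", 100)
table.put("foo", "bar", "u2", 200)
table.put("foo", "baz", "u1", 300)
table.put("qux", "bar", "u3", 400)

# Requirement 1: how many users performed 'foo' on 'bar'.
foo_bar = table.rows.get(row_key("foo", "bar"), {})
count_foo_bar = len(foo_bar)

# Requirement 3: users who performed 'foo' on 'bar' since some cutoff.
cutoff = 150  # assumed "start of last week" timestamp
recent_users = sorted(u for u, ts in foo_bar.items() if ts >= cutoff)

# Requirement 4: objects ranked by number of users for action 'foo'.
ranking = sorted(
    ((key.split(SEP, 1)[1], len(cols)) for key, cols in table.prefix_scan("foo" + SEP)),
    key=lambda item: -item[1],
)
```

Note that requirement 1 has to count the columns of one very wide row, which
is where drawback a) starts to hurt.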
The friends part is not that trivial: you have to maintain that
relationship outside this main table or create complex composite entities (I
need to think about it more; HBase is not a graph database).
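
One simple way to read "maintain the relationship outside the main table" is
to keep a separate user -> friends mapping and intersect it with the columns
of the action|object row on the client side. This is only a sketch under that
assumption; friends_of, friends_who_did and the sample ids are hypothetical
names, not anything from HBase.

```python
def friends_who_did(friends_of, row_columns, user_id):
    """Return the friends of user_id that appear as columns in one row.

    friends_of:  dict user_id -> set of friend ids (the separate table)
    row_columns: dict user_id -> value for a single action|object row
    """
    return sorted(friends_of.get(user_id, set()) & set(row_columns))


# Hypothetical data: the row 'foo|bar' and a tiny friends table.
foo_bar_columns = {"u1": 100, "u2": 200, "u3": 300}
friends_of = {"u1": {"u2", "u9"}, "u3": set()}

friends_result = friends_who_did(friends_of, foo_bar_columns, "u1")
```

This costs one extra read per query and pulls the whole row to the client, so
it may not hold up for very wide rows.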
On Thu, Sep 5, 2013 at 1:16 AM, Marcos Sousa wrote:
> I've been working with HBase for the last 3 months; now I have to store user
> actions, at first glance using HBase.
> I have a limited number of actions, thousands of objects and about 50
> million users interacting with them, around 2 billion interactions per
> I have to answer these questions:
> How many users performed action 'foo' on object 'bar'?
> Which friends performed action 'foo' on object 'bar'?
> Which users performed 'foo' on object 'bar' last week?
> Which objects received the most 'foo' actions?
> Does anybody have suggestions for a schema for this problem?
> Best regards,
> Marcos Sousa