HBase >> mail # user >> Practical Upper Limit on Number of Version Stored?

Re: Practical Upper Limit on Number of Version Stored?
You want the last n events?
Make the column name (Epoch - timestamp) + event name, or something similar.
Then just return up to n columns.
The events come back in reverse chronological order, newest first.
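
The reverse-timestamp column naming suggested above can be sketched roughly as follows. This is only an illustration, not HBase client code: a plain dict stands in for a single row, Python's byte-string sorting stands in for HBase's lexicographic ordering of column qualifiers, and "Epoch" is assumed to mean a large constant (here the maximum signed 64-bit value). The helper names reverse_qualifier and last_n_events are hypothetical.

```python
import struct

# Assumption: "Epoch" in the suggestion is read as a large constant, so that
# (constant - timestamp) shrinks as events get newer.
LONG_MAX = 2**63 - 1


def reverse_qualifier(timestamp_ms: int, event_name: str) -> bytes:
    """Build a column qualifier that sorts newest-first under byte ordering."""
    reversed_ts = LONG_MAX - timestamp_ms
    # Fixed-width big-endian encoding makes byte order match numeric order
    # for non-negative values.
    return struct.pack(">q", reversed_ts) + event_name.encode("utf-8")


def last_n_events(row: dict, n: int) -> list:
    """Return the values of the first n columns in sorted qualifier order."""
    return [row[q] for q in sorted(row)[:n]]


# A plain dict stands in for one row: qualifier -> value.
row = {}
for ts, name in [
    (1386290000000, "login"),
    (1386290005000, "click"),
    (1386290009000, "logout"),
]:
    row[reverse_qualifier(ts, name)] = name

print(last_n_events(row, 2))  # ['logout', 'click']
```

Because the newest event has the smallest qualifier, reading the first n columns of the row gives the last n events without fetching the rest.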
Sent from a remote device. Please excuse any typos...

Mike Segel

> On Dec 5, 2013, at 7:27 PM, "Shawn Hermans" <[EMAIL PROTECTED]> wrote:
>
> I guess I don't really understand why I wouldn't want to do this.  For our use case we only really care about the user's last 50 to 200 events.  We don't really care about deleting events explicitly.  More than likely we would enable a TTL to get rid of events older than a certain time.  
>
> I guess my question is whether or not there is an issue with storing this many versions.  Are there any measurable drawbacks?  
>
> —
> Sent from Mailbox for iPhone
>
> On Thu, Dec 5, 2013 at 7:11 PM, Michael Segel <[EMAIL PROTECTED]>
> wrote:
>
>> You really don't want to do this.
>> It's not what the versioning was meant for and it has a couple of serious flaws.
>> The biggest flaw... what happens when you want to delete a version? ...
>> There are other options... depending on your use case and how you use the events.
>> Using versioning for anything beyond versions of the same data is not a good idea.
>>> On Dec 5, 2013, at 4:47 PM, Shawn Hermans <[EMAIL PROTECTED]> wrote:
>>> All,
>>> I am working on an HBase application where we store user events in an HBase
>>> table.  The row key is a user identifier and each column is an event
>>> identifier.  Most users only have a handful of events (10 or fewer), but
>>> some users have a few hundred thousand events or more and this causes
>>> issues when an HBase client tries to retrieve all those events.
>>>
>>> We are looking at different ways of limiting the number of events returned.
>>> One idea is to not store each event using its own column qualifier, but
>>> instead use HBase's versioning capability to store the last 100 to 200
>>> events. It doesn't seem like we would run into issues with this approach,
>>> but I want to see if anyone has had any practical experience in this area.
>>> The advice given in http://hbase.apache.org/book/schema.versions.html is a
>>> little ambiguous.
>>>
>>> Thanks,
>>> Shawn
>> The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental.
>> Use at your own risk.
>> Michael Segel
>> michael_segel (AT) hotmail.com