HBase >> mail # user >> Practical Upper Limit on Number of Version Stored?


Re: Practical Upper Limit on Number of Version Stored?
You really don't want to do this.
It's not what versioning was meant for, and it has a couple of serious flaws.

The biggest flaw... what happens when you want to delete a version? ...

There are other options... depending on your use case and how you use the events.

Using versioning for anything beyond versions of the same data is not a good idea.
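
For anyone weighing the idea, here is a toy Python model of a single column capped at a fixed number of versions. It is not the HBase client API: `VersionedCell`, `put`, and `get` are hypothetical names made up for this illustration, and trimming eagerly on every write is a simplification. It only tries to show one reason events fit awkwardly: versions are keyed by timestamp, so in this model two events written with the same timestamp collapse into one.

```python
class VersionedCell:
    """Toy stand-in for one column that retains at most max_versions values.

    Hypothetical illustration only; this is not how the HBase client is used.
    """

    def __init__(self, max_versions):
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        self.max_versions = max_versions
        self._versions = {}  # timestamp -> value

    def put(self, timestamp, value):
        # A write at an existing timestamp replaces that version.
        self._versions[timestamp] = value
        # Drop the oldest versions once the cap is exceeded.
        for ts in sorted(self._versions)[:-self.max_versions]:
            del self._versions[ts]

    def get(self, limit=None):
        # Return (timestamp, value) pairs, newest first.
        newest_first = sorted(self._versions.items(), reverse=True)
        return newest_first[:limit]


cell = VersionedCell(max_versions=3)
for ts, event in [(1, "login"), (2, "view"), (3, "click"), (4, "logout")]:
    cell.put(ts, event)

recent = cell.get()  # only the three newest events remain

# A second event with timestamp 4 silently replaces "logout".
cell.put(4, "purchase")
after_collision = cell.get()
```

If events are distinct pieces of data rather than revisions of one value, that silent replacement is a loss of data, which is the kind of mismatch being warned about here.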

On Dec 5, 2013, at 4:47 PM, Shawn Hermans <[EMAIL PROTECTED]> wrote:

> All,
> I am working on an HBase application where we store user events in an HBase
> table.  The row key is a user identifier and each column is an event
> identifier.  Most users only have a handful of events (10 or fewer), but
> some users have a few hundred thousand events or more and this causes
> issues when an HBase client tries to retrieve all those events.
>
> We are looking at different ways of limiting the number of events returned.
> One idea is to stop storing each event under its own column qualifier and
> instead use HBase's versioning capability to store the last 100 to 200
> events. It doesn't seem like we would run into issues with this approach,
> but I want to see if anyone has had any practical experience in this area.
> The advice given in http://hbase.apache.org/book/schema.versions.html is a
> little ambiguous.
>
> Thanks,
> Shawn

The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com