HBase >> mail # dev >> Purpose of versions in HBase...


Re: Purpose of versions in HBase...
I believe there's a bit more to it...

Which is why I am asking.

As to #3... What happens to a column when you put a tombstone marker on it?
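To make the question concrete, here is a toy model in Python. It is not HBase code: `VersionedCell` and its methods are names made up for illustration. It assumes a delete marker hides every version whose timestamp is at or below the marker's own timestamp, and that nothing is physically removed until a major compaction, which is my understanding of the behavior rather than a statement of the implementation.

```python
class VersionedCell:
    """Toy model of one column: timestamp -> value, plus delete markers."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.versions = {}      # timestamp -> value
        self.tombstones = []    # timestamps of delete markers

    def put(self, ts, value):
        self.versions[ts] = value

    def delete(self, ts):
        # Nothing is removed here; the marker only hides older data on read.
        self.tombstones.append(ts)

    def get(self, n=1):
        """Return up to n visible (timestamp, value) pairs, newest first."""
        cutoff = max(self.tombstones, default=-1)
        visible = [(ts, v) for ts, v in self.versions.items() if ts > cutoff]
        visible.sort(reverse=True)
        return visible[:min(n, self.max_versions)]

    def major_compact(self):
        """Physically drop the masked versions and the markers themselves."""
        cutoff = max(self.tombstones, default=-1)
        self.versions = {ts: v for ts, v in self.versions.items() if ts > cutoff}
        self.tombstones = []


cell = VersionedCell()
cell.put(100, "a")
cell.put(200, "b")
cell.delete(150)        # masks ts=100, leaves ts=200 visible
cell.put(120, "late")   # written after the delete, but ts <= 150, so still masked
print(cell.get(n=3))
cell.major_compact()
cell.put(120, "late")   # the same put after compaction is visible
print(cell.get(n=3))
```

Under these assumptions the marker acts on timestamps, not on arrival order, so a put that arrives later with an older timestamp stays hidden until the marker is compacted away.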

On Dec 9, 2013, at 11:56 AM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:

> I suspect the honest answer would be "because BigTable paper had it" :P
>
> There are several aspects to cell versioning (I may be missing some).
> First (not the most important), due to the way HBase stores things
> (write-once files), it comes very cheaply - very little runtime cost, and
> not so much code needs to be written to have it.
> Second, internally, versioning allows for snapshot isolation (within a
> server) to work - with multiple versions present, scanners can read the
> older ones to get a consistent view (that's MVCC).
> Third, user-visible, timestamp-based cell versioning is there so that users
> can control the order of things (e.g. delete all cells before...), either
> through fabricated timestamps, or using external timestamps, e.g. from
> external logs. In fact, with the current HBase implementation of auto-ts
> (there are JIRAs to fix it), that's the only "bulletproof" way to use HBase;
> internal HBase versioning relies on server clocks, which is fraught with
> peril (granted, most systems will rarely hit these problems, and may be ok
> with some reordering anyway).
> Fourth, multi-versions as such could be used for some application-specific
> scenarios; the Percolator paper is a good example.
>
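On the second point, the way I picture it is sketched below. This is a toy model in Python, not HBase's actual MVCC code: `Store`, `Scanner`, `write`, and `read_point` are hypothetical names. The assumption is that every write gets a sequence number, and a scanner only sees cells whose sequence number is at or below the read point it captured when it was opened.

```python
class Store:
    """Toy store: every write is tagged with an increasing sequence number."""

    def __init__(self):
        self.next_seq = 1
        self.read_point = 0   # highest sequence number that is fully written
        self.cells = []       # (column, seq, value), append-only

    def write(self, updates):
        """Write several columns as one unit: readers see all of it or none."""
        seq = self.next_seq
        self.next_seq += 1
        for column, value in updates.items():
            self.cells.append((column, seq, value))
        self.read_point = seq  # publish only after every cell is in place

    def open_scanner(self):
        return Scanner(self, self.read_point)


class Scanner:
    def __init__(self, store, read_point):
        self.store = store
        self.read_point = read_point  # fixed for the scanner's lifetime

    def get(self, column):
        """Newest value for column among writes at or below the read point."""
        best = None
        for col, seq, value in self.store.cells:
            if col == column and seq <= self.read_point:
                if best is None or seq > best[0]:
                    best = (seq, value)
        return None if best is None else best[1]


store = Store()
store.write({"a": 1, "b": 1})
old = store.open_scanner()
store.write({"a": 2, "b": 2})     # lands after `old` was opened
print(old.get("a"), old.get("b"))
print(store.open_scanner().get("a"))
```

The older scanner keeps reading the first write for both columns even though newer cells exist, which is only possible because the older versions are still stored.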
>
>
> On Sun, Dec 8, 2013 at 9:35 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>
>>
>> Hi,
>>
>> In a different thread, we were discussing good and better schema designs.
>> In order to really understand why one should or should not do something,
>> it's kind of important to understand the underlying reasons why HBase was
>> designed the way it was.
>>
>> So since we have a bunch of committers here, and cc'ing the Dev list,
>>
>> I'd like to explore why HBase has cell versioning. What's its
>> purpose? How is it implemented, and why?
>>
>> This may seem a bit esoteric, but it would go a long way in educating many
>> of the users on the hbase mailing list.
>>
>> Also it may be a good couple of paragraphs to add to the online
>> reference...
>>
>> -Mike
>>
>>
>
> --
> CONFIDENTIALITY NOTICE
> NOTICE: This message is intended for the use of the individual or entity to
> which it is addressed and may contain information that is confidential,
> privileged and exempt from disclosure under applicable law. If the reader
> of this message is not the intended recipient, you are hereby notified that
> any printing, copying, dissemination, distribution, disclosure or
> forwarding of this communication is strictly prohibited. If you have
> received this communication in error, please contact the sender immediately
> and delete it from your system. Thank You.

The opinions expressed here are mine; while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk.
Michael Segel
michael_segel (AT) hotmail.com