Re: Purpose of versions in HBase...
Michael Segel 2013-12-09, 23:17
I believe there's a bit more to it...
Which is why I am asking.
As to #3... What happens to a column when you put a tombstone marker on it?
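To make the tombstone question concrete, here is a minimal Python sketch of a toy model of one versioned column. It is not HBase code: ToyCell and its methods are hypothetical names made up for illustration. It assumes the behaviour described in the HBase reference guide, where a delete writes a marker that masks every version at or below its timestamp, and the marker plus the masked versions are only physically dropped at a major compaction.

```python
class ToyCell:
    """Toy model of one versioned column (hypothetical, not an HBase API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.puts = {}        # timestamp -> value
        self.tombstones = []  # timestamps of delete markers

    def put(self, ts, value):
        # Writes never overwrite in place; each timestamp is its own version.
        self.puts[ts] = value

    def delete(self, ts):
        # A delete only records a marker; no stored version is removed here.
        self.tombstones.append(ts)

    def _masked(self, ts):
        # Assumption: a marker at time T hides every version with ts <= T.
        return any(ts <= t for t in self.tombstones)

    def get(self, n=1):
        # Return up to n visible versions, newest first.
        newest_first = sorted(self.puts.items(), reverse=True)
        visible = [(ts, v) for ts, v in newest_first if not self._masked(ts)]
        return visible[: min(n, self.max_versions)]

    def major_compact(self):
        # Drop masked versions and the markers, then trim to max_versions.
        live = {ts: v for ts, v in self.puts.items() if not self._masked(ts)}
        keep = sorted(live, reverse=True)[: self.max_versions]
        self.puts = {ts: live[ts] for ts in keep}
        self.tombstones = []


cell = ToyCell()
cell.put(100, "a")
cell.put(200, "b")
cell.delete(250)        # tombstone at ts=250
cell.put(150, "late")   # written after the delete, but ts <= 250: still hidden
cell.put(300, "c")
print(cell.get(3))      # [(300, 'c')]

cell.major_compact()    # marker and masked versions are gone
cell.put(150, "again")
print(cell.get(3))      # [(300, 'c'), (150, 'again')]
```

The point of the sketch is the put at ts=150: in this model a write with an older timestamp stays invisible for as long as the marker exists, which is one reason client-supplied timestamps need care.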
On Dec 9, 2013, at 11:56 AM, Sergey Shelukhin <[EMAIL PROTECTED]> wrote:
> I suspect the honest answer would be "because BigTable paper had it" :P
> There are several aspects to cell versioning (I may be missing some).
> First (not the most important), due to the way HBase stores things
> (write-once files), it comes very cheaply - very little runtime cost, and
> not so much code needs to be written to have it.
> Second, internally, versioning allows for snapshot isolation (within a
> server) to work - with multiple versions present, scanners can read older
> ones to get a consistent view (that's MVCC).
> Third, user-visible, timestamp-based cell versioning is there so that users
> can control the order of things (e.g. delete all cells before...), either
> through fabricated timestamps or using external timestamps, e.g. from
> external logs. In fact, with the current HBase implementation of auto-ts (there
> are JIRAs to fix it), that's the only "bulletproof" way to use HBase;
> internal HBase versioning relies on server clocks, which is fraught with
> peril (granted, most systems will rarely hit these problems, and may be ok
> with some reordering anyway).
> Fourth, multiple versions as such could be used for some application-specific
> scenarios; the Percolator paper is a good example.
> On Sun, Dec 8, 2013 at 9:35 AM, Michael Segel <[EMAIL PROTECTED]> wrote:
>> In a different thread, we were discussing good and better schema designs.
>> In order to really understand why one should or should not do something,
>> it's kind of important to understand the underlying reasons why HBase was
>> designed the way it was.
>> So since we have a bunch of committers here, and cc'ing the Dev list,
>> I'd like to explore why HBase has cell versioning: what its purpose is,
>> how it is implemented, and why.
>> This may seem a bit esoteric, but it would go a long way in educating many
>> of the users on the hbase mailing list.
>> Also it may be a good couple of paragraphs to add to the online
The opinions expressed here are mine, while they may reflect a cognitive thought, that is purely accidental.
Use at your own risk.
michael_segel (AT) hotmail.com