HBase, mail # user - Why does a delete behave like this?


Re: Why does a delete behave like this?
Stack 2013-12-09, 21:30
On Mon, Dec 9, 2013 at 4:47 PM, Niels Basjes <[EMAIL PROTECTED]> wrote:

>
> Why has it been designed/implemented like this?
> What is the logic behind this model?
>

Hey Niels:

It is probably fair to call this an instance of implementation leaking into
and polluting our data model.  We should fix it.

Currently, deletes always sort before all other types when all other
coordinates are the same (same row, same column family, same timestamp,
etc.).  IIRC, it was done this way a long time ago because it made
reasoning about deletes 'easier'.  This forced sort ordering is why you see
the behavior you note in your shell experiments.
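
To make the ordering above concrete, here is a rough sketch in Python.  It
is a toy model under stated assumptions, not HBase's actual comparator: the
names Cell, TYPE_RANK, sort_key and visible are made up for illustration,
and only one delete type is modeled (a marker that masks puts at exactly
its own timestamp).

```python
# Toy model only: NOT HBase's real KeyValue comparator or scanner.
from collections import namedtuple

# Hypothetical, simplified cell: coordinates plus a type and a value.
Cell = namedtuple("Cell", "row family qualifier ts type value")

# Assumption for this sketch: lower rank sorts first, so deletes precede puts.
TYPE_RANK = {"Delete": 0, "Put": 1}


def sort_key(cell):
    # Newer timestamps sort first (hence -ts); type only breaks ties.
    return (cell.row, cell.family, cell.qualifier, -cell.ts, TYPE_RANK[cell.type])


def visible(cells):
    """Return the puts that survive a single pass over the sorted cells."""
    masked = set()  # (row, family, qualifier, ts) coordinates hit by a delete
    result = []
    for c in sorted(cells, key=sort_key):
        coord = (c.row, c.family, c.qualifier, c.ts)
        if c.type == "Delete":
            masked.add(coord)
        elif coord not in masked:
            result.append(c)
    return result


# The put is written AFTER the delete, but carries the same timestamp.
writes = [
    Cell("r1", "cf", "q", 100, "Delete", None),
    Cell("r1", "cf", "q", 100, "Put", "v1"),
]
print(visible(writes))  # the put is masked even though it arrived later
```

Because the delete always sorts first, one forward pass is enough to decide
what is masked, which is the 'easier' part; the cost is that write order at
the same timestamp is ignored.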

Our Sergey recently suggested we undo our factoring in 'type' when
sorting KeyValues/Cells; rather, we would break ties on sequence id when
all else matches.  Awkwardly, we'd then have to let the user supply a
sequence id when querying a specific Cell.  This would not be easy to do.
Sequence id is an internal, amorphous notion at the moment -- it exists
while KeyValues are in flight but is (mostly) dropped after KeyValues
persist to hfiles -- but it looks like it is fast becoming more tangible
given some issues that arise around WAL replay at recovery time and in
corner cases when replicating.
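
For contrast, a rough sketch of what tie-breaking on sequence id might look
like.  Again this is a toy model, not a description of any HBase code: the
seq_id field, SeqCell, seq_sort_key and visible_by_seq are hypothetical
names, and the sketch assumes a higher sequence id means a later write.

```python
# Toy model only: illustrates tie-breaking on a sequence id instead of type.
from collections import namedtuple

# Hypothetical cell that keeps its sequence id (assumed: higher = written later).
SeqCell = namedtuple("SeqCell", "row family qualifier ts seq_id type value")


def seq_sort_key(cell):
    # Type is no longer part of the key; the later write sorts first.
    return (cell.row, cell.family, cell.qualifier, -cell.ts, -cell.seq_id)


def visible_by_seq(cells):
    """Return puts not masked by a delete written after them."""
    masked = set()  # coordinates covered by a delete seen so far
    result = []
    for c in sorted(cells, key=seq_sort_key):
        coord = (c.row, c.family, c.qualifier, c.ts)
        if c.type == "Delete":
            masked.add(coord)
        elif coord not in masked:
            result.append(c)
    return result


old_put = SeqCell("r1", "cf", "q", 100, 1, "Put", "v0")
delete = SeqCell("r1", "cf", "q", 100, 2, "Delete", None)
new_put = SeqCell("r1", "cf", "q", 100, 3, "Put", "v1")

# Only the put written after the delete survives.
print(visible_by_seq([old_put, delete, new_put]))
```

In this model a put written after the delete at the same timestamp is
visible, while one written before it stays masked.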

What is your thinking on this, Niels?  Is the current implementation getting
in the way of your building an app on HBase?

Thanks,
St.Ack