HBase >> mail # dev >> commit semantics


RE: commit semantics
Ok cool. Thanks for clarifying.

I think what I had in mind was a hybrid -- basically try to accumulate transactions up to an app-configurable time window before syncing (and delay acking the client until the sync completes).

Just caught up on an earlier response from Joy on this as well.

<<if the performance with setting of 1 doesn't work out - we may need an option to delay acks until actual syncs .. (most likely we would be able to compromise on latency to get higher throughput - but wouldn't be willing to compromise on data integrity)>>

Yes, that's what I had in mind. Agree that this could be something we explore later if necessary.
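The hybrid floated above -- hold edits for a bounded window, sync once, and only then ack the batched clients -- could be sketched roughly as follows. This is an assumed design for illustration, not actual HBase code; all names (`DelayedAckLog`, `flush`, etc.) are hypothetical.

```python
class DelayedAckLog:
    """Toy model of the proposed hybrid: edits accumulate for up to a
    configurable window, one sync covers the whole batch, and clients
    are acked only after that sync -- trading some latency for
    throughput without giving up durability. Illustrative only."""

    def __init__(self, window_secs=0.005):
        # window_secs is the bound on how long an edit may wait before
        # a driver (e.g. a timer) calls flush(); not enforced here.
        self.window_secs = window_secs
        self.pending = []   # (client, entry) pairs awaiting sync + ack
        self.synced = []    # entries considered durable
        self.acked = []     # clients acked only after their sync

    def append(self, client, entry):
        # The client does NOT get an ack yet -- it waits for the sync.
        self.pending.append((client, entry))

    def flush(self):
        # Called when the window expires: one sync, then ack everyone
        # whose edit that sync made durable.
        if self.pending:
            self.synced.extend(e for _, e in self.pending)  # the sync
            self.acked.extend(c for c, _ in self.pending)   # acks after
            self.pending = []
```

Under this scheme no acked write can be lost to a region server crash, since the ack is issued strictly after the sync.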

Regards,
Kannan
-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Jean-Daniel Cryans
Sent: Tuesday, January 12, 2010 11:44 AM
To: hbase[EMAIL PROTECTED]
Subject: Re: commit semantics

On Tue, Jan 12, 2010 at 11:29 AM, Kannan Muthukkaruppan
<[EMAIL PROTECTED]> wrote:
>
> For data integrity, going with group commits (batch commits) seems like a good option. My understanding of group commits as implemented in 0.21 is as follows:
>
> *         We wait on acknowledging back to the client until the transaction has been synced to HDFS.

Yes

>
> *         Syncs are batched: a sync is called if the queue has enough transactions or if a timer expires. (I would imagine that both the number of transactions to batch up and the timer are already configurable knobs?) In this mode, the latency increase on writes seen by the client is upper-bounded by the timer setting plus the cost of the sync itself.

Nope. There are two kinds of group commit around that piece of code:

1) What you called batch commit: a configurable value
(flushlogentries) says how many entries we have to append to
trigger a sync. Clients don't block until that sync happens, so a
region server failure could lose some rows, depending on the time
between the last sync and the failure.

If flushlogentries=100 and 99 entries have been lying around for more
than the timer's timeout (default 1 sec), the timer will force a sync
of those entries.
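The two triggers described above (count threshold, plus a timer safety net) can be modeled with a small sketch. This is a toy simulation of the behavior as described in this thread, not HBase's actual API; the class and method names are made up.

```python
import time

class BatchCommitLog:
    """Toy model of the batch-commit path: entries are appended
    immediately and the client returns right away, but sync() only
    fires once flushlogentries entries have accumulated or a timer
    expires. Illustrative names only."""

    def __init__(self, flushlogentries=100, timer_secs=1.0):
        self.flushlogentries = flushlogentries
        self.timer_secs = timer_secs
        self.unsynced = 0
        self.last_sync = time.monotonic()
        self.sync_calls = 0

    def append(self, entry):
        # Client is NOT held until the sync, so a crash here can lose
        # up to flushlogentries - 1 recent edits.
        self.unsynced += 1
        if self.unsynced >= self.flushlogentries:
            self.sync()

    def timer_tick(self):
        # Periodic timer force-syncs any lingering entries.
        if self.unsynced > 0 and \
                time.monotonic() - self.last_sync >= self.timer_secs:
            self.sync()

    def sync(self):
        self.sync_calls += 1
        self.unsynced = 0
        self.last_sync = time.monotonic()
```

With the defaults, 99 entries sit unsynced until either a 100th arrives or the 1-second timer fires -- which is exactly the durability window a crash can fall into.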

2) Group commit happens at high concurrency and is only useful if a
high number of clients are writing at the same time and
flushlogentries=1. What happens in the LogSyncer thread is that,
instead of calling sync() for every entry, we "group" the clients
waiting on the previous sync and issue only one sync for all of them.
In this case, when the call returns to the client, we are sure that
the value is in HDFS.
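The coalescing behavior of that syncer thread can be sketched in miniature. This is a simplified model of the mechanism as described above, not the real LogSyncer; all names are illustrative.

```python
class GroupCommitLog:
    """Toy model of the group-commit path: with flushlogentries=1,
    every edit must be durable before its client is acked, but rather
    than one sync() per edit, a syncer pass groups all clients that
    queued behind the in-flight sync and issues a single sync for the
    whole group. Illustrative names only."""

    def __init__(self):
        self.waiting = []    # clients queued while a sync is in flight
        self.sync_calls = 0

    def append(self, client):
        # Each concurrent writer queues up behind the pending sync.
        self.waiting.append(client)

    def syncer_pass(self):
        # One pass of the LogSyncer-style thread: a single sync covers
        # every client that queued since the last pass.
        if not self.waiting:
            return []
        self.sync_calls += 1
        acked, self.waiting = self.waiting, []
        return acked    # these clients now know the data is in HDFS
```

The payoff is visible in the ratio: ten concurrent writers are served by one sync call instead of ten, while each still gets full durability before its ack.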

>
>
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of stack
> Sent: Tuesday, January 12, 2010 10:52 AM
> To: hbase[EMAIL PROTECTED]
> Cc: Kannan Muthukkaruppan; Dhruba Borthakur
> Subject: Re: commit semantics
>
> On Tue, Jan 12, 2010 at 10:14 AM, Dhruba Borthakur <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
> Hi stack,
>
> I was meaning "what if the application inserted the same record into two
> Hbase instances"? Of course, now the onus is on the appl to keep both of
> them in sync and recover from any inconsistencies between them.
>
> Ok. Like your "Overlapping Clusters for HA" from http://www.borthakur.com/ftp/hdfs_high_availability.pdf?
>
> I'm not sure how the application could return after writing one cluster without waiting on the second to complete, as you suggest above. It could write in parallel, but the second thread might not complete for myriad reasons. What then? And as you say, on reads the client would have to do the reconciliation.
>
> Isn't there already a 'scalable database' that gives you this headache for free, without your having to do the work yourself (smile)?
>
> Do you think there is a problem with syncing on every write (with some batching of writes when concurrency is high)? Or, if that's too slow for your needs, with adding the holding of clients until the sync happens, as Joydeep suggests? Would that be sufficient, data-integrity-wise?
>
> St.Ack
>
> Thanks,
> St.Ack
>