Re: batch update question

Hi there, if you look in the source code for HTable, there is a list of Put
objects.  That list is the buffer, and it lives on the client side, not in
the region server.
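
To make the idea concrete, here is a toy model of such a client-side buffer.
It is only a sketch and not HBase's code: FakePut and BufferedClient are
made-up names, the "round trip" is just a counter, and the 1000-byte limit is
arbitrary. In the real client of this era the corresponding knobs should be
HTable.setAutoFlush(false), HTable.setWriteBufferSize(long) (or the
hbase.client.write.buffer setting) and HTable.flushCommits(), but check the
javadoc for your version.

```java
import java.util.ArrayList;
import java.util.List;

public class WriteBufferSketch {

    /** Hypothetical stand-in for a Put: a row key plus an approximate size in bytes. */
    static final class FakePut {
        final String rowKey;
        final long sizeBytes;

        FakePut(String rowKey, long sizeBytes) {
            this.rowKey = rowKey;
            this.sizeBytes = sizeBytes;
        }
    }

    /** Toy client that holds puts locally and sends them in one simulated round trip. */
    static final class BufferedClient {
        private final List<FakePut> buffer = new ArrayList<>();
        private final long maxBufferBytes;
        private long bufferedBytes = 0;
        private int roundTrips = 0;
        private int rowsSent = 0;

        BufferedClient(long maxBufferBytes) {
            this.maxBufferBytes = maxBufferBytes;
        }

        /** Queue a put locally; flush only once the buffer grows past its limit. */
        void put(FakePut p) {
            buffer.add(p);
            bufferedBytes += p.sizeBytes;
            if (bufferedBytes > maxBufferBytes) {
                flush();
            }
        }

        /** Send everything queued so far as a single simulated request. */
        void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            roundTrips++;
            rowsSent += buffer.size();
            buffer.clear();
            bufferedBytes = 0;
        }

        int pending() { return buffer.size(); }
        int roundTrips() { return roundTrips; }
        int rowsSent() { return rowsSent; }
    }

    public static void main(String[] args) {
        BufferedClient client = new BufferedClient(1000); // 1000-byte toy limit
        for (int i = 0; i < 25; i++) {
            client.put(new FakePut("row-" + i, 100));
        }
        client.flush(); // push out whatever is still sitting in the buffer
        System.out.println(client.rowsSent() + " rows in " + client.roundTrips() + " round trips");
    }
}
```

Note the explicit flush at the end: with buffering on, anything still sitting
in the list has not been sent yet, so in the real client you would call
flushCommits() (or close the table) before assuming the data is written.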

On 9/5/12 12:04 PM, "Lin Ma" <[EMAIL PROTECTED]> wrote:

>Thank you Stack for the detailed directions!
>
>1. You are right, I have not run into any real row contention issues. My
>purpose is to understand the issue in advance, and also to use it to
>understand HBase in general better;
>2. For the comment on the API page you referred to -- "If isAutoFlush
><http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableInterface.html#isAutoFlush%28%29>
>is false, the update is buffered until the internal buffer is full." -- I
>am confused about what the buffer is. Is it a buffer on the client side or
>a buffer in the region server? Is there a way to configure how much it holds
>before flushing?
>3. Why could batching resolve contention on the same row in theory,
>compared to non-batch operations? Besides preparing a solution in
>advance, I want to learn a bit about why. :-)
>
>regards,
>Lin
>
>On Wed, Sep 5, 2012 at 4:00 AM, Stack <[EMAIL PROTECTED]> wrote:
>
>> On Sun, Sep 2, 2012 at 2:13 AM, Lin Ma <[EMAIL PROTECTED]> wrote:
>> > Hello guys,
>> >
>> > I am reading the book "HBase, the definitive guide". At the beginning of
>> > chapter 3, it is mentioned that in order to reduce the performance impact
>> > of clients updating the same row (lock contention issues for automatic
>> > write), batch update is preferred. My question is, for an MR job, what are
>> > the batch update methods we could leverage to resolve the issue? And for
>> > an API client, what are the batch update methods we could leverage to
>> > resolve the issue?
>> >
>>
>> Do you actually have a problem where there is contention on a single row?
>>
>> Use methods like
>>
>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#put(java.util.List)
>>
>> or the batch methods listed earlier in the API.  You should set
>> autoflush to false too:
>>
>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableInterface.html#isAutoFlush()
>>
>> Even with batching, a highly contended row might hold up inserts... but
>> are you sure you actually have this problem in the first place?
>>
>> St.Ack
>>