HBase, mail # user - batch update question


Re: batch update question
Doug Meil 2012-09-05, 17:01

Hi there, for more information about the HBase client, see:

http://hbase.apache.org/book.html#client
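
To make the client-side buffer described in the quoted reply below more concrete, here is a minimal sketch of the idea in plain Java. It is a simplified model, not HBase's actual code: the class WriteBufferSketch, the Update type, and the byte sizes are all hypothetical names and values chosen for illustration. It only shows the behavior the API comment describes: with auto-flush on, every put is sent immediately; with auto-flush off, puts accumulate in a client-side list until the buffer exceeds its configured size or the caller flushes explicitly.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of a client-side write buffer (hypothetical, not HBase code). */
class WriteBufferSketch {

    /** Hypothetical stand-in for a Put: a row key plus a value. */
    static final class Update {
        final String row;
        final byte[] value;

        Update(String row, byte[] value) {
            this.row = row;
            this.value = value;
        }

        /** Rough size estimate used to decide when the buffer is full. */
        long size() {
            return row.length() + value.length;
        }
    }

    private final List<Update> buffer = new ArrayList<>();
    private final boolean autoFlush;
    private final long maxBufferBytes;
    private long bufferedBytes = 0;
    private int flushCount = 0; // number of simulated round trips to the server

    WriteBufferSketch(boolean autoFlush, long maxBufferBytes) {
        this.autoFlush = autoFlush;
        this.maxBufferBytes = maxBufferBytes;
    }

    /** Buffers the update; flushes at once if auto-flush is on or the buffer is over its limit. */
    void put(Update update) {
        buffer.add(update);
        bufferedBytes += update.size();
        if (autoFlush || bufferedBytes > maxBufferBytes) {
            flush();
        }
    }

    /** Simulates sending all buffered updates in one request, then empties the buffer. */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        flushCount++; // a real client would issue the network call here
        buffer.clear();
        bufferedBytes = 0;
    }

    int flushCount() {
        return flushCount;
    }

    int pending() {
        return buffer.size();
    }

    public static void main(String[] args) {
        byte[] payload = new byte[95]; // "row-1" (5 chars) + 95 bytes = 100 per update
        WriteBufferSketch auto = new WriteBufferSketch(true, 1000);
        WriteBufferSketch buffered = new WriteBufferSketch(false, 1000);
        for (int i = 0; i < 50; i++) {
            auto.put(new Update("row-1", payload));
            buffered.put(new Update("row-1", payload));
        }
        buffered.flush(); // push whatever is still sitting in the client-side buffer
        System.out.println("auto-flush round trips: " + auto.flushCount());
        System.out.println("buffered round trips:   " + buffered.flushCount());
    }
}
```

In this model the buffered client makes far fewer simulated round trips than the auto-flush one for the same 50 updates. The trade-off is that anything still pending in the buffer has not reached the server until a flush happens, so the caller has to flush before relying on the data being written.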

On 9/5/12 12:59 PM, "Doug Meil" <[EMAIL PROTECTED]> wrote:

>
>Hi there, if you look in the source code for HTable there is a list of Put
>objects.  That's the buffer, and it's a client-side buffer.
>
>On 9/5/12 12:04 PM, "Lin Ma" <[EMAIL PROTECTED]> wrote:
>
>>Thank you, Stack, for the detailed directions!
>>
>>1. You are right, I have not run into any real row contention issues. My
>>purpose is to understand the issue in advance, and also to use it to
>>understand HBase in general better.
>>2. Regarding the comment on the API page you referred to -- "If isAutoFlush
>><http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableInterface.html#isAutoFlush%28%29>
>>is false, the update is buffered until the internal buffer is full." -- I
>>am confused about what the buffer is. Is it a buffer on the client side or
>>in the region server? Is there a way to configure how much it holds before
>>flushing?
>>3. Why could batching resolve contention on the same row in theory,
>>compared to non-batch operation? Besides preparing a solution in
>>advance, I want to learn a bit about why. :-)
>>
>>regards,
>>Lin
>>
>>On Wed, Sep 5, 2012 at 4:00 AM, Stack <[EMAIL PROTECTED]> wrote:
>>
>>> On Sun, Sep 2, 2012 at 2:13 AM, Lin Ma <[EMAIL PROTECTED]> wrote:
>>> > Hello guys,
>>> >
>>> > I am reading the book "HBase: The Definitive Guide". At the beginning of
>>> > chapter 3, it is mentioned that, in order to reduce the performance
>>> > impact of clients updating the same row (lock contention issues for
>>> > atomic writes), batch update is preferred. My question is: for an MR
>>> > job, what batch update methods could we leverage to resolve the issue?
>>> > And for an API client, what batch update methods could we leverage to
>>> > resolve the issue?
>>> >
>>>
>>> Do you actually have a problem where there is contention on a single row?
>>>
>>> Use methods like
>>>
>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#put(java.util.List)
>>>
>>> or the batch methods listed earlier in the API.  You should set
>>> autoflush to false too:
>>>
>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTableInterface.html#isAutoFlush()
>>>
>>> Even with batching, a highly contended row might hold up inserts... but
>>> are you sure you actually have this problem in the first place?
>>>
>>> St.Ack
>>>
>