Re: Extremely slow when loading small amount of data from HBase

You have 4000 regions on an 8-node cluster?  I think you need to bring
that *way* down…

re:  "something like 40 regions"
Yep… around there.  See…
http://hbase.apache.org/book.html#bigger.regions
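
For scale, the arithmetic behind these numbers can be sketched as below. This is only an illustration: the function name is hypothetical, the region and node counts are the ones reported in this thread, and the 40-per-node figure is the rough guideline quoted above (originally stated per table), not an official limit.

```python
def regions_per_node(total_regions: int, nodes: int) -> float:
    """Average number of regions hosted by each region server."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total_regions / nodes


NODES = 8                 # cluster size reported in the thread
GUIDELINE_PER_NODE = 40   # rough guideline quoted in the thread (an assumption)

before_merge = regions_per_node(8000, NODES)   # region count before merging
after_merge = regions_per_node(4000, NODES)    # region count after merging
target_total = GUIDELINE_PER_NODE * NODES      # cluster-wide total under the guideline

print(f"before merge: {before_merge:.0f} regions per node")
print(f"after merge:  {after_merge:.0f} regions per node")
print(f"guideline:    about {target_total} regions in total")
```

Even after the merge, the per-node count is more than ten times the quoted guideline, which is why the advice is to keep reducing it.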

On 9/5/12 8:06 AM, "Jean-Marc Spaggiari" <[EMAIL PROTECTED]> wrote:

>But I think you should also look at why we have so many regions...
>Because even if you merge them manually now, you might face the same
>issue soon.
>
>2012/9/5, n keywal <[EMAIL PROTECTED]>:
>> Hi,
>>
>> With 8 regionservers, yes, you can. Target a few hundred by default,
>> imho.
>>
>> N.
>>
>> On Wed, Sep 5, 2012 at 4:55 AM, 某因幡 <[EMAIL PROTECTED]> wrote:
>>
>>> +HBase users.
>>>
>>>
>>> ---------- Forwarded message ----------
>>> From: Dmitriy Ryaboy <[EMAIL PROTECTED]>
>>> Date: 2012/9/4
>>> Subject: Re: Extremely slow when loading small amount of data from HBase
>>> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
>>>
>>>
>>> I think the hbase folks recommend something like 40 regions per node
>>> per table, but I might be misremembering something. Have you tried
>>> emailing the hbase users list?
>>>
>>> On Sep 4, 2012, at 3:39 AM, 某因幡 <[EMAIL PROTECTED]> wrote:
>>>
>>> > After merging ~8000 regions down to ~4000 on an 8-node cluster, things
>>> > are getting better.
>>> > Should I continue merging?
>>> >
>>> >
>>> > 2012/8/29 Dmitriy Ryaboy <[EMAIL PROTECTED]>:
>>> >> Can you try the same scans with a regular hbase mapreduce job? If you
>>> >> see the same problem, it's an hbase issue. Otherwise, we need to see the
>>> >> script and some facts about your table (how many regions, how many rows,
>>> >> how big a cluster, is the small range all on one region server, etc.)
>>> >>
>>> >> On Aug 27, 2012, at 11:49 PM, 某因幡 <[EMAIL PROTECTED]> wrote:
>>> >>
>>> >>> When I load a range of data from HBase simply using a row key range in
>>> >>> HBaseStorageHandler, I find that the speed is acceptable when I'm
>>> >>> trying to load some tens of millions of rows or more, while the only map
>>> >>> task ends up in a timeout when it's only some thousands of rows.
>>> >>> What is going wrong here? Tried both Pig-0.9.2 and Pig-0.10.0.
>>> >>>
>>> >>>
>>> >>> --
>>> >>> language: Chinese, Japanese, English
>>> >
>>> >
>>> >
>>> > --
>>> > language: Chinese, Japanese, English
>>>
>>>
>>> --
>>> language: Chinese, Japanese, English
>>>
>>
>