Re: md5 hash key and splits

Stack, re:  "Where did you read that?", I think he might also be referring
to this...

http://hbase.apache.org/book.html#important_configurations

On 8/30/12 8:04 PM, "Mohit Anchlia" <[EMAIL PROTECTED]> wrote:

>In general, isn't it better to split the regions so that the load can be
>spread across the cluster to avoid hotspots?
>
>I read about pre-splitting here:
>
>http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
>
>On Thu, Aug 30, 2012 at 4:30 PM, Amandeep Khurana <[EMAIL PROTECTED]>
>wrote:
>
>> Also, you might have read that an initial loading of data can be better
>> distributed across the cluster if the table is pre-split rather than
>> starting with a single region and splitting (possibly aggressively,
>> depending on the throughput) as the data loads in. Once you are in a
>> stable state with regions distributed across the cluster, there is really
>> no benefit in terms of spreading load by managing splitting manually vs.
>> letting HBase do it for you. At that point it's about what Ian mentioned:
>> predictability of latencies by avoiding splits happening at a busy time.
>>
>> On Thu, Aug 30, 2012 at 4:26 PM, Ian Varley <[EMAIL PROTECTED]>
>> wrote:
>>
>> > The Facebook devs have mentioned in public talks that they pre-split
>> > their tables and don't use automated region splitting. But as far as I
>> > remember, the reason for that isn't predictability of spreading load, so
>> > much as predictability of uptime & latency (they don't want an automated
>> > split to happen at a random busy time). Maybe that's what you mean, Mohit?
>> >
>> > Ian
>> >
>> > On Aug 30, 2012, at 5:45 PM, Stack wrote:
>> >
>> > On Thu, Aug 30, 2012 at 7:35 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
>> > From what I've read it's advisable to do manual splits since you are able
>> > to spread the load in a more predictable way. If I am missing something
>> > please let me know.
>> >
>> >
>> > Where did you read that?
>> > St.Ack
>> >
>> >
>>
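
For anyone following the pre-splitting point above with MD5-hashed row keys, here is a rough sketch of how evenly spaced split points might be computed. It assumes row keys are stored as 32-character lowercase hex MD5 strings; the function names `md5_split_points` and `region_index` are made up for this illustration and are not part of HBase. MD5 output is close to uniform, so cutting the hex keyspace into equal slices should give regions of similar size.

```python
import bisect
import hashlib


def md5_split_points(num_regions, key_hex_len=32):
    """Return num_regions - 1 evenly spaced hex split keys.

    Assumption: row keys are lowercase hex strings of length key_hex_len
    (32 for a full MD5 digest), so plain string order matches numeric order.
    """
    if num_regions < 2:
        return []
    keyspace = 16 ** key_hex_len
    return [
        format(i * keyspace // num_regions, "0{}x".format(key_hex_len))
        for i in range(1, num_regions)
    ]


def region_index(row_key, split_points):
    """Index of the region that would hold row_key.

    A region covers [start_key, end_key), so a key equal to a split point
    belongs to the region that starts at that split point.
    """
    return bisect.bisect_right(split_points, row_key)


if __name__ == "__main__":
    points = md5_split_points(4)
    for p in points:
        print(p)
    # Hash a hypothetical natural key and see which region it would land in.
    row_key = hashlib.md5(b"user123").hexdigest()
    print(row_key, "-> region", region_index(row_key, points))
```

The resulting list is the kind of thing you would hand to table creation, for example the `SPLITS` option of the HBase shell `create` command. If the row keys are raw 16-byte digests rather than hex strings, the split points would need to be byte arrays instead.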