

Mohit Anchlia 2012-08-29, 22:56
Stack 2012-08-30, 04:19
Mohit Anchlia 2012-08-30, 04:38
Stack 2012-08-30, 05:50
Mohit Anchlia 2012-08-30, 14:35
Stack 2012-08-30, 22:45
Ian Varley 2012-08-30, 23:26
Amandeep Khurana 2012-08-30, 23:30
Re: md5 hash key and splits
In general, isn't it better to split the regions so that the load can be
spread across the cluster to avoid hotspots?

I read about pre-splitting here:

http://blog.sematext.com/2012/04/09/hbasewd-avoid-regionserver-hotspotting-despite-writing-records-with-sequential-keys/
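
To make the pre-splitting idea concrete for MD5-hashed row keys (the subject of this thread), here is a minimal sketch. It assumes the row key is the raw 16-byte MD5 digest and that regions should cover equal slices of that keyspace. The function names (`md5_split_points`, `region_index`, `md5_row_key`) and the `user<N>` keys are hypothetical and only for illustration; nothing here calls HBase itself.

```python
import hashlib
from bisect import bisect_right

KEY_BYTES = 16  # an MD5 digest is 16 bytes long


def md5_split_points(num_regions):
    """Return num_regions - 1 evenly spaced 16-byte split keys."""
    if num_regions < 1:
        raise ValueError("num_regions must be >= 1")
    keyspace = 1 << (8 * KEY_BYTES)
    return [
        (keyspace * i // num_regions).to_bytes(KEY_BYTES, "big")
        for i in range(1, num_regions)
    ]


def region_index(row_key, split_points):
    """Index of the region that would hold row_key.

    A split key is treated as the first key of the region to its right,
    so a key equal to a split point lands in the later region.
    """
    return bisect_right(split_points, row_key)


def md5_row_key(user_key):
    """Hash a (hypothetical) application key into a 16-byte row key."""
    return hashlib.md5(user_key.encode("utf-8")).digest()


splits = md5_split_points(4)
counts = [0] * 4
for i in range(10000):
    counts[region_index(md5_row_key(f"user{i}"), splits)] += 1

print([s.hex() for s in splits])  # the three split keys, hex-encoded
print(counts)                     # keys landing in each of the 4 regions
```

Because MD5 output is close to uniform, each of the four ranges should receive roughly a quarter of the keys. Split keys like these are the kind of `byte[][]` array that can be handed to HBase at table-creation time (for example via `HBaseAdmin.createTable(desc, splitKeys)`); whether evenly spaced splits suit a given workload is a separate question.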

On Thu, Aug 30, 2012 at 4:30 PM, Amandeep Khurana <[EMAIL PROTECTED]> wrote:

> Also, you might have read that an initial loading of data can be better
> distributed across the cluster if the table is pre-split rather than
> starting with a single region and splitting (possibly aggressively,
> depending on the throughput) as the data loads in. Once you are in a stable
> state with regions distributed across the cluster, there is really no
> benefit in terms of spreading load by managing splitting manually vs.
> letting HBase do it for you. At that point it's about what Ian mentioned -
> predictability of latencies by avoiding splits happening at a busy time.
>
> On Thu, Aug 30, 2012 at 4:26 PM, Ian Varley <[EMAIL PROTECTED]> wrote:
>
> > The Facebook devs have mentioned in public talks that they pre-split their
> > tables and don't use automated region splitting. But as far as I remember,
> > the reason for that isn't predictability of spreading load, so much as
> > predictability of uptime & latency (they don't want an automated split to
> > happen at a random busy time). Maybe that's what you mean, Mohit?
> >
> > Ian
> >
> > On Aug 30, 2012, at 5:45 PM, Stack wrote:
> >
> > On Thu, Aug 30, 2012 at 7:35 AM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > From what I've read, it's advisable to do manual splits since you are able
> > to spread the load in a more predictable way. If I am missing something,
> > please let me know.
> >
> >
> > Where did you read that?
> > St.Ack
> >
> >
>
Doug Meil 2012-08-31, 13:09
Stack 2012-08-31, 15:30
Stack 2012-08-31, 06:52
Mohit Anchlia 2012-08-31, 14:55
Stack 2012-08-31, 15:32