HBase >> mail # user >> md5 hash key and splits


Re: md5 hash key and splits
On Wed, Aug 29, 2012 at 10:50 PM, Stack <[EMAIL PROTECTED]> wrote:

> On Wed, Aug 29, 2012 at 9:38 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > On Wed, Aug 29, 2012 at 9:19 PM, Stack <[EMAIL PROTECTED]> wrote:
> >
> >> On Wed, Aug 29, 2012 at 3:56 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> >> > If I use an md5 hash + timestamp rowkey, would HBase automatically detect
> >> > the difference in ranges and perform splits? How does splitting work in
> >> > such cases, or is it still advisable to manually split the regions?
> >>
> >
> > What logic would you recommend to split the table into multiple regions
> > when using md5 hash?
> >
>
> It's hard to know how well your inserts will spread over the md5
> namespace ahead of time.  You could try sampling or just let HBase
> take care of the splits for you (Is there a problem w/ your letting
> HBase do the splits?)
>
> St.Ack

From what I've read, it's advisable to do manual splits, since you are able
to spread the load in a more predictable way. If I am missing something,
please let me know.
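
To make the manual-split option concrete, here is a minimal sketch of how evenly spaced split points over the md5 namespace could be computed. It is only an illustration, not anything HBase ships: the function names, the key layout (16-byte md5 digest followed by an 8-byte big-endian timestamp), and the region count are all assumptions. It also assumes the hashed ids really do spread evenly over the namespace, which is exactly the part that is hard to know ahead of time without sampling.

```python
import hashlib
import struct
from bisect import bisect_right

MD5_BYTES = 16  # an MD5 digest is 128 bits


def uniform_split_points(num_regions):
    """Return num_regions - 1 boundary keys cutting the 16-byte md5 space evenly."""
    if num_regions < 2:
        return []
    space = 1 << (8 * MD5_BYTES)
    return [
        (space * i // num_regions).to_bytes(MD5_BYTES, "big")
        for i in range(1, num_regions)
    ]


def make_row_key(entity_id, timestamp_ms):
    """Hypothetical layout: md5(entity_id) followed by an 8-byte big-endian timestamp."""
    digest = hashlib.md5(entity_id.encode("utf-8")).digest()
    return digest + struct.pack(">q", timestamp_ms)


def region_index(row_key, split_points):
    """Index of the region holding row_key, using unsigned byte-wise ordering.

    Each split point is treated as the inclusive start key of the next region.
    """
    return bisect_right(split_points, row_key)


splits = uniform_split_points(4)
key = make_row_key("user42", 1346302200000)
print([s.hex() for s in splits])
print(region_index(key, splits))
```

Boundary keys of this shape are the kind of byte arrays one would supply as split keys when creating a pre-split table. If a sample of real keys shows the load is skewed, the boundaries could be taken from quantiles of the sample instead of from even spacing.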