David Koch 2013-01-25, 23:59
Ted Yu 2013-01-26, 00:14
Re: Rule of thumb: Size of data to send per RPC in a scan
Hello Ted,

Thank you for the link.

/David

On Sat, Jan 26, 2013 at 1:14 AM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Looks like HBASE-2214 'Do HBASE-1996 -- setting size to return in scan
> rather than count of rows -- properly' may help you.
> But that is only in 0.96.
>
> Lars H presented some performance numbers in:
>   HBASE-7008 Set scanner caching to a better default, disable Nagles
> where the default for "hbase.client.scanner.caching" was changed to 100
>
> Cheers
>
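The two JIRAs above map onto different limits on the client-side Scan object: setCaching() and setBatch() are the count-based knobs available before 0.96, while HBASE-2214 adds a size-based setMaxResultSize() in 0.96. A minimal sketch, assuming the standard HBase client API of that era; the values are illustrative only, and the 0.96-only call is left commented out:

import org.apache.hadoop.hbase.client.Scan;

public class ScanLimitsSketch {
    public static Scan buildScan() {
        Scan scan = new Scan();
        // Count-based limits (pre-0.96): how many rows are buffered per RPC,
        // and how many KeyValues a single Result may carry for very wide rows.
        scan.setCaching(100);
        scan.setBatch(50);
        // Size-based limit from HBASE-2214 (0.96+): cap the bytes returned
        // per RPC instead of the row count.
        // scan.setMaxResultSize(8 * 1024 * 1024L); // illustrative value only
        return scan;
    }
}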
> On Fri, Jan 25, 2013 at 3:59 PM, David Koch <[EMAIL PROTECTED]> wrote:
>
> > Hello,
> >
> > Is there a rule to determine the best batch/caching combination for
> > maximizing scan performance as a function of KV size and (average) number
> > of columns per row key?
> >
> > I have 0.5 KB per value (constant) and an average of 10 values per row
> > key - heavy tailed, so some outliers have 100k KVs - and around
> > 100 million rows in the table. The cluster consists of 30 region
> > servers with 24 GB of RAM each, and nodes are connected with a 1 Gbit
> > link. I am running Map/Reduce jobs on the table, also with 30 task
> > trackers.
> >
> > I tried:
> > cache 1, no batching -> 14 min
> > cache 1000, batch 50 -> 11 min
> > cache 5000, batch 25 -> crash (timeouts)
> > cache 2000, batch 25 -> 15 min
> >
> > Job time can vary quite significantly depending on whatever activity
> > (compactions?) is going on in the background. Also, I cannot probe for
> > the best combination indefinitely since there are actual production
> > jobs queued. I did expect a larger speed-up with respect to no
> > caching/batching at all - is this unjustified?
> >
> > In short, I am looking for some tips for making scans in a Map/Reduce
> > context faster :-)
> >
> > Thank you,
> >
> > /David
> >
>
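To make the tuning discussion concrete, here is a rough sketch of the kind of Map/Reduce scan job David describes, assuming the standard TableMapReduceUtil API; the table name "mytable" and job name are hypothetical, and the caching/batch values are just his second test combination, not recommended defaults:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.NullOutputFormat;

public class ScanTuningJob {

    // Mapper that only walks the rows; a real job would emit something useful.
    static class TouchRowsMapper extends TableMapper<NullWritable, NullWritable> {
        @Override
        protected void map(ImmutableBytesWritable rowKey, Result columns, Context context) {
            // Count the KVs so the scan actually has to transfer the data.
            context.getCounter("scan", "kvs").increment(columns.size());
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        Scan scan = new Scan();
        // Per-scan caching overrides the "hbase.client.scanner.caching" default Ted mentions.
        scan.setCaching(1000);      // rows fetched per RPC (second test combination above)
        scan.setBatch(50);          // KVs per Result, bounds memory on the 100k-KV outlier rows
        scan.setCacheBlocks(false); // a full scan should not churn the region server block cache

        Job job = new Job(conf, "scan-tuning-test"); // hypothetical job name
        job.setJarByClass(ScanTuningJob.class);
        TableMapReduceUtil.initTableMapperJob(
                "mytable",               // hypothetical table name
                scan,
                TouchRowsMapper.class,
                NullWritable.class,
                NullWritable.class,
                job);
        job.setNumReduceTasks(0);
        job.setOutputFormatClass(NullOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

On the "cache 5000 -> crash (timeouts)" data point: larger caching values mean more data assembled per next() round-trip and more mapper work between round-trips, which makes it easier to exceed the region server's scanner lease (hbase.regionserver.lease.period in 0.94-era configs). Lowering caching or raising that timeout are the usual workarounds.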