HBase >> mail # user >> High IPC Latency


RE: High IPC Latency
Hi Yousuf

> The client caches the locations of the regionservers, so after a couple
> of minutes of the experiment running, it wouldn't need to re-visit
> ZooKeeper, I believe. Correct me if I am wrong please.
Yes, you are right.
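[Editor's note] The region-location caching Ram confirms here can be sketched roughly as follows. This is hypothetical Python illustrating the idea only, not the actual HBase client (HConnection) code; names such as `meta_lookup` are made up.

```python
# Hypothetical sketch of client-side region-location caching (editor's
# illustration, not the real HBase HConnection implementation).

class LocationCache:
    def __init__(self, meta_lookup):
        self.meta_lookup = meta_lookup   # expensive lookup via ZooKeeper/.META.
        self.cache = {}                  # row key -> regionserver address
        self.remote_lookups = 0          # how often we had to go remote

    def locate(self, row):
        # Only a cache miss triggers the expensive remote lookup.
        if row not in self.cache:
            self.remote_lookups += 1
            self.cache[row] = self.meta_lookup(row)
        return self.cache[row]

    def invalidate(self, row):
        # Done when an RPC fails because the region has moved.
        self.cache.pop(row, None)

cache = LocationCache(lambda row: "rs-%d" % (hash(row) % 10))
first = cache.locate("row-0042")
second = cache.locate("row-0042")   # served from cache, no remote round trip
```

After the first couple of minutes of the benchmark the cache is warm, which is why ZooKeeper drops out of the steady-state latency picture.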

Regards
Ram

> -----Original Message-----
> From: Yousuf Ahmad [mailto:[EMAIL PROTECTED]]
> Sent: Friday, October 19, 2012 1:30 AM
> To: [EMAIL PROTECTED]; lars hofhansl
> Cc: Ivan Brondino; Ricardo Vilaça
> Subject: Re: High IPC Latency
>
> Hi,
>
> Thank you for your questions guys.
>
> We are using HBase 0.92 with HDFS 1.0.1.
>
> The experiment lasts 15 minutes. The measurements stabilize in the
> first two minutes of the run.
>
> The data is distributed almost evenly across the regionservers so each
> client hits most of them over the course of the experiment. However,
> for the data we have, any given multi-get or scan should touch only
> one or at most two regions.
>
> The client caches the locations of the regionservers, so after a couple
> of minutes of the experiment running, it wouldn't need to re-visit
> ZooKeeper, I believe. Correct me if I am wrong please.
>
> Regards,
> Yousuf
>
>
> On Thu, Oct 18, 2012 at 2:42 PM, lars hofhansl <[EMAIL PROTECTED]> wrote:
>
> > Also, what version of HBase/HDFS is this using?
> >
> >
> >
> >
> > ----- Original Message -----
> > From: "Pamecha, Abhishek" <[EMAIL PROTECTED]>
> > To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> > Cc: Ivan Brondino <[EMAIL PROTECTED]>; Ricardo Vilaça <
> > [EMAIL PROTECTED]>
> > Sent: Thursday, October 18, 2012 11:38 AM
> > Subject: RE: High IPC Latency
> >
> > Is it sustained for the same client hitting the same region server, or
> > does it get better for the same client-RS combination when run for a
> > longer duration? Trying to eliminate ZooKeeper from this.
> >
> > Thanks,
> > Abhishek
> >
> > From: Yousuf Ahmad [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, October 18, 2012 11:26 AM
> > To: [EMAIL PROTECTED]
> > Cc: Ivan Brondino; Ricardo Vilaça
> > Subject: High IPC Latency
> >
> > Hello,
> >
> > We are seeing slow times for read operations in our experiments. We
> > are hoping that you guys can help us figure out what's going wrong.
> >
> > Here are some details:
> >
> >   *   We are running a read-only benchmark on our HBase cluster.
> >   *   There are 10 regionservers, each co-located with a datanode.
> >       HDFS replication is 3x.
> >   *   All the data read by the experiment is already in the block
> >       cache and the hit ratio is 99%.
> >   *   We have 10 clients, each with around 400 threads making a mix
> >       of read-only requests involving multi-gets and scans.
> >   *   We settled on the default client pool type/size (roundrobin/1)
> >       and a regionserver handler count of 100 after testing various
> >       combinations to see what setting worked best.
> >   *   Our scans are short, fetching around 10 rows on average.
> >       Scanner caching is set to 50.
> >   *   An average row in a scan has either around 10 columns (small
> >       row) or around 200 columns (big row).
> >   *   Our multi-gets fetch around 200 rows on average.
> >   *   An average row in a multi-get has around 10 columns.
> >   *   Each column holds an integer (encoded into bytes).
> >   *   None of the machines involved reach CPU, memory, or IO
> >       saturation. In fact resource utilization stays quite low.
> >   *   Our statistics show that the average time for a scan, measured
> >       starting from the first scanner.next() call to the last one
> >       which returns a null, is around 2-3 seconds.
> >   *   Since we use scanner caching, the major portion of this time
> >       (around 2 seconds) is spent on the first call to next(), while
> >       the remaining calls take a negligible amount of time.
> >   *   Similarly, we see that a multi-get on average takes around 2
> >       seconds.
> >   *   A single get on average takes around 1 second.
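[Editor's note] The observation that nearly all of the 2-3 s lands on the first next() call is consistent with scanner caching: the first next() pays for one RPC that prefetches up to `caching` rows, and the remaining calls are served from the client-side buffer. A back-of-envelope sketch:

```python
import math

def scan_rpcs(rows_in_scan, caching):
    """Round trips a scan needs with client-side scanner caching enabled."""
    return max(1, math.ceil(rows_in_scan / caching))

# The thread's numbers: ~10-row scans with caching = 50 need a single
# RPC, so the entire network/server cost shows up on the first next().
rpcs_short_scan = scan_rpcs(10, 50)   # 1
```

One RPC per scan would put the ~2 s in per-request overhead rather than row volume, which would also be consistent with single gets taking around 1 s.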
> > We are not sure what the bottleneck is or where it lies. We thought