Re: HBase Thrift inserts bottlenecked somewhere -- but where?
I think there is also a parameter for controlling the queue size. If you
are opening persistent connections (connections that never close), you
should probably set the queue size to 0, because those queued connections
will never get threads to serve them anyway: the connections that get
through first will hog the thread pool.

If you see "queue is full" errors in the Thrift log, then there is a
bottleneck on the Thrift side in terms of the number of workers. Of
course, the assumption is that you have already solved the basic problem
of distributing the load across region servers.
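
For reference, the worker pool and queue for the thread-pool-based Thrift
server are set in hbase-site.xml on the Thrift server hosts. A minimal
sketch, assuming the conf keys defined in TBoundedThreadPoolServer.java;
the values are only illustrative:

  <property>
    <name>hbase.thrift.maxWorkerThreads</name>
    <!-- upper bound on connections that get a dedicated worker thread -->
    <value>1000</value>
  </property>
  <property>
    <name>hbase.thrift.maxQueuedRequests</name>
    <!-- queue size; with persistent connections, a queued connection may
         never reach a worker, per the note above -->
    <value>0</value>
  </property>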

On Fri, Mar 1, 2013 at 10:52 AM, Ted Yu <[EMAIL PROTECTED]> wrote:

> Here're the parameters you should look at:
>
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/thrift/HThreadedSelectorServerArgs.java:
>      "hbase.thrift.selector.threads";
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/thrift/HThreadedSelectorServerArgs.java:
>      "hbase.thrift.worker.threads";
>
> hbase-server/src/main/java/org/apache/hadoop/hbase/thrift/TBoundedThreadPoolServer.java:
>      "hbase.thrift.threadKeepAliveTimeSec";
>
> Oftentimes, the source code is the best help :-)
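
These keys go in hbase-site.xml on each Thrift server host; a sketch with
illustrative values (the selector and worker thread keys apply to the
threaded-selector server implementation they are defined in, and the
Thrift server has to be restarted to pick up a change):

  <property>
    <name>hbase.thrift.selector.threads</name>
    <value>4</value>
  </property>
  <property>
    <name>hbase.thrift.worker.threads</name>
    <value>64</value>
  </property>
  <property>
    <name>hbase.thrift.threadKeepAliveTimeSec</name>
    <value>60</value>
  </property>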
>
> On Fri, Mar 1, 2013 at 10:49 AM, Dan Crosta <[EMAIL PROTECTED]> wrote:
>
> > We are generating the load from multiple machines, yes.
> >
> > Do you happen to know the name of the setting for the number of
> > ThriftServer threads? I can't find anything that is obviously about
> > that in the CDH manager.
> >
> > - Dan
> >
> >
> > On Mar 1, 2013, at 1:46 PM, Varun Sharma wrote:
> >
> > > Did you try running 30-40 proc(s) on one machine and another 30-40
> > > proc(s) on another machine to see if that doubles the throughput?
> > >
> > > On Fri, Mar 1, 2013 at 10:46 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:
> > >
> > >> Hi,
> > >>
> > >> I don't know how many worker threads you have at the thrift servers.
> > >> Each thread gets dedicated to a single connection and only serves that
> > >> connection. New connections get queued. Also, are you sure that you
> > >> are not saturating the client side making the calls?
> > >>
> > >> Varun
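
A quick way to sanity-check the thread-per-connection math is to compare
the number of established client connections on each Thrift server with
the configured worker count; assuming the default Thrift port of 9090,
something like:

  netstat -tan | grep ':9090' | grep -c ESTABLISHED

If that count is consistently higher than the number of worker threads,
the extra connections are waiting in the queue rather than being served.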
> > >>
> > >>
> > >> On Fri, Mar 1, 2013 at 9:33 AM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
> > >>
> > >>> The primary unit of load distribution in HBase is the region; make
> > >>> sure you have more than one. This is well documented in the manual:
> > >>> http://hbase.apache.org/book/perf.writing.html
> > >>>
> > >>> J-D
> > >>>
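
One common way to start with more than one region is to pre-split the
table at creation time. A sketch in the HBase shell, where the table name,
column family, and split points are placeholders; real split points should
match the row key distribution:

  create 'mytable', 'cf', SPLITS => ['1000', '2000', '3000']

With a single region, every write lands on one region server regardless of
how many Thrift servers or client processes are doing the writing.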
> > >>> On Fri, Mar 1, 2013 at 4:17 AM, Dan Crosta <[EMAIL PROTECTED]> wrote:
> > >>>> We are using a 6-node HBase cluster with a Thrift Server on each
> > >>>> of the RegionServer nodes, and trying to evaluate maximum write
> > >>>> throughput for our use case (which involves many processes sending
> > >>>> mutateRowsTs commands). Somewhere between about 30 and 40 processes
> > >>>> writing into the system we cross the threshold where adding
> > >>>> additional writers yields only very limited returns to throughput,
> > >>>> and I'm not sure why. We see that the CPU and Disk on the
> > >>>> DataNode/RegionServer/ThriftServer machines are not saturated, nor
> > >>>> is the NIC in those machines. I'm a little unsure where to look
> > >>>> next.
> > >>>>
> > >>>> A little more detail about our deployment:
> > >>>>
> > >>>> * CDH 4.1.2
> > >>>> * DataNode/RegionServer/ThriftServer class: EC2 m1.xlarge
> > >>>> ** RegionServer: 8GB heap
> > >>>> ** ThriftServer: 1GB heap
> > >>>> ** DataNode: 4GB heap
> > >>>> ** EC2 ephemeral (i.e. local, not EBS) volumes used for HDFS
> > >>>>
> > >>>> If there's any other information that I can provide, or any other
> > >>>> configuration or system settings I should look at, I'd appreciate
> > >>>> the pointers.
> > >>>>
> > >>>> Thanks,
> > >>>> - Dan
> > >>>
> > >>
> > >>
> >
> >
>