HBase, mail # user - HBase Thrift inserts bottlenecked somewhere -- but where?


Re: HBase Thrift inserts bottlenecked somewhere -- but where?
Varun Sharma 2013-03-01, 18:46
Did you try running 30-40 processes on one machine and another 30-40
processes on a second machine to see whether that doubles the throughput?
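
A rough way to read the result of that experiment, as a minimal sketch: if
throughput roughly doubles with the second client machine, the first client
machine was the limit; if it barely moves, the limit is on the server side
(Thrift or HBase). The function name, the 15% tolerance, and the example
rates are all hypothetical and only illustrate the arithmetic.

```python
def classify_bottleneck(one_machine_rate, two_machine_rate, tolerance=0.15):
    """Compare write throughput (e.g. rows/s) measured with one client
    machine against the same workload spread over two client machines.

    Returns (scaling, verdict). The tolerance is an arbitrary assumption
    that allows for measurement noise.
    """
    if one_machine_rate <= 0:
        raise ValueError("one_machine_rate must be positive")

    scaling = two_machine_rate / one_machine_rate

    if scaling >= 2.0 * (1.0 - tolerance):
        # Near-linear scaling: the single client machine was saturated.
        verdict = "client-bound"
    elif scaling <= 1.0 + tolerance:
        # Almost no gain: something behind the clients is the limit.
        verdict = "server-bound"
    else:
        # Partial gain: both sides are probably contributing.
        verdict = "mixed"
    return scaling, verdict


if __name__ == "__main__":
    # Made-up rates, not measurements from this thread.
    print(classify_bottleneck(10000, 19500))
    print(classify_bottleneck(10000, 10400))
```

A "server-bound" verdict does not say which server-side component is the
limit; it only rules out the client machines.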

On Fri, Mar 1, 2013 at 10:46 AM, Varun Sharma <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I don't know how many worker threads you have on the Thrift servers. Each
> thread is dedicated to a single connection and serves only that
> connection. New connections get queued. Also, are you sure that you are not
> saturating the client side making the calls?
>
> Varun
>
>
> On Fri, Mar 1, 2013 at 9:33 AM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
>
>> The primary unit of load distribution in HBase is the region, so make
>> sure you have more than one. This is well documented in the manual:
>> http://hbase.apache.org/book/perf.writing.html
>>
>> J-D
>>
>> On Fri, Mar 1, 2013 at 4:17 AM, Dan Crosta <[EMAIL PROTECTED]> wrote:
>> > We are using a 6-node HBase cluster with a Thrift Server on each of the
>> > RegionServer nodes, and trying to evaluate maximum write throughput for our
>> > use case (which involves many processes sending mutateRowsTs commands).
>> > Somewhere between about 30 and 40 processes writing into the system we
>> > cross the threshold where adding additional writers yields only very
>> > limited returns to throughput, and I'm not sure why. We see that the CPU
>> > and Disk on the DataNode/RegionServer/ThriftServer machines are not
>> > saturated, nor is the NIC in those machines. I'm a little unsure where to
>> > look next.
>> >
>> > A little more detail about our deployment:
>> >
>> > * CDH 4.1.2
>> > * DataNode/RegionServer/ThriftServer class: EC2 m1.xlarge
>> > ** RegionServer: 8GB heap
>> > ** ThriftServer: 1GB heap
>> > ** DataNode: 4GB heap
>> > ** EC2 ephemeral (i.e. local, not EBS) volumes used for HDFS
>> >
>> > If there's any other information that I can provide, or any other
>> > configuration or system settings I should look at, I'd appreciate the
>> > pointers.
>> >
>> > Thanks,
>> >  - Dan
>>
>
>
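
On the quoted advice about having more than one region: one common way to
get there is to pre-split the table at creation time. The sketch below
computes evenly spaced split points, assuming row keys begin with a
uniformly distributed, fixed-width hex prefix (for example, a hash of the
natural key). That key layout and the function name are assumptions; nothing
in this thread says how the row keys are actually built.

```python
def hex_split_keys(num_regions, key_width=8):
    """Return num_regions - 1 evenly spaced split points over the space of
    lowercase hex strings of length key_width.

    Only useful if row keys really are spread uniformly over that space.
    """
    if num_regions < 2:
        raise ValueError("need at least 2 regions to have any split points")

    space = 16 ** key_width  # number of distinct hex prefixes
    return [
        format(space * i // num_regions, "0{}x".format(key_width))
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    print(hex_split_keys(4))  # ['40000000', '80000000', 'c0000000']
```

Split points like these could then be supplied when the table is created,
for instance through the `SPLITS` option of the HBase shell's `create`
command. With sequential or otherwise skewed row keys, evenly spaced hex
splits would not spread the write load, so the key design has to be checked
first.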