HBase >> mail # user >> PerformanceEvaluation results


Re: PerformanceEvaluation results
Tim,

Here's the problem in a nutshell.
With respect to hardware, you have 5.4k RPM drives? 6 drives and 8 cores?
Small, slow drives, and still a ratio of less than one when you compare spindles to cores.

I appreciate that you want to maximize performance, but when it comes to tuning, you have to start before you get your hardware.

You are asking a question about tuning, but how can we say whether the numbers are OK?
Have you looked at your GCs and enabled MSLAB? We don't know. Network configuration?

I mean that there's a lot missing, and fine-tuning a cluster is something you have to do on your own. I guess I could say your numbers look fine to me for that config... but honestly, it would be a SWAG.
Sent from a remote device. Please excuse any typos...

Mike Segel
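
The hardware point above can be checked with a little arithmetic. The sketch below uses only the figures from Tim's original message quoted further down (2x quad-core CPUs, 6 disks, 24GB RAM, and the listed heap sizes per RegionServer node). It assumes the worst case where every JVM grows to its configured heap, and it leaves out the DataNode and the OS because the thread gives no numbers for them. The function names are made up for illustration.

```python
# Figures copied from Tim's original message in this thread.
CORES = 2 * 4        # 2x quad-core Xeon E5630 per RegionServer node
DISKS = 6            # 6x 250G SATA 5.4K drives per node
NODE_RAM_GB = 24     # physical memory per RegionServer node

# Configured heap sizes on each RegionServer node, in GB.
# The DataNode heap is not stated in the thread, so it is omitted here.
heaps_gb = {
    "RegionServer": 6,
    "TaskTracker": 1,
    "map slots": 11 * 1,     # 11 mappers at 1GB each
    "reduce slots": 7 * 1,   # 7 reducers at 1GB each
}


def spindles_per_core(disks, cores):
    """Ratio of disk spindles to CPU cores on one node."""
    return disks / cores


def worst_case_heap_gb(heaps):
    """Sum of configured heaps, assuming every JVM reaches its maximum."""
    return sum(heaps.values())


ratio = spindles_per_core(DISKS, CORES)
total_heap = worst_case_heap_gb(heaps_gb)

print(f"spindles per core: {ratio:.2f}")
print(f"worst-case heap:   {total_heap} GB of {NODE_RAM_GB} GB physical")
```

With these inputs the ratio comes out below one, and the configured heaps alone add up to slightly more than the physical memory, so it may be worth checking for swapping before tuning anything else.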

On Feb 1, 2012, at 7:09 AM, Tim Robertson <[EMAIL PROTECTED]> wrote:

> Thanks Michael,
>
> It's a small cluster, but is the hardware so bad?  We are particularly
> interested in relatively low load for random read/write (2000
> transactions per second on <1k rows) but a decent full table scan
> speed, as we aim to mount Hive tables on HBase backed tables.
>
> Regarding tuning... not exactly sure which you would be interested in
> seeing.  The config is all here:
> http://code.google.com/p/gbif-common-resources/source/browse/#svn%2Fcluster-puppet%2Fmodules%2Fhadoop%2Ftemplates
>
> Cheers,
> Tim
>
>
>
> On Wed, Feb 1, 2012 at 1:56 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>> No.
>> What tuning did you do?
>> Why such a small cluster?
>>
>> Sorry, but when you start off with a bad hardware configuration, you can get Hadoop/HBase to work, but performance will always be sub-optimal.
>>
>>
>>
>> Sent from my iPhone
>>
>> On Feb 1, 2012, at 6:52 AM, "Tim Robertson" <[EMAIL PROTECTED]> wrote:
>>
>>> Hi all,
>>>
>>> We have a 3 node cluster (CDH3u2) with the following hardware:
>>>
>>> RegionServers (+DN + TT)
>>>  CPU: 2x Intel(R) Xeon(R) CPU E5630 @ 2.53GHz (quad)
>>>  Disks: 6x250G SATA 5.4K
>>>  Memory: 24GB
>>>
>>> Master (+ZK, JT, NN)
>>>  CPU: Intel(R) Xeon(R) CPU X3363 @ 2.83GHz, 2x6MB (quad)
>>>  Disks: 2x500G SATA 7.2K
>>>  Memory: 8GB
>>>
>>> Memory wise, we have:
>>> Master:
>>>  NN: 1GB
>>>  JT: 1GB
>>>  HBase master: 6GB
>>>  ZK: 1GB
>>> RegionServers:
>>>  RegionServer: 6GB
>>>  TaskTracker: 1GB
>>>  11 Mappers @ 1GB each
>>>  7 Reducers @ 1GB each
>>>
>>> HDFS was empty, and I ran randomWrite and scan, both with the number
>>> of clients set to 50 (seemed to spawn 500 Mappers though...)
>>>
>>> randomWrite:
>>> 12/02/01 13:27:47 INFO mapred.JobClient:     ROWS=52428500
>>> 12/02/01 13:27:47 INFO mapred.JobClient:     ELAPSED_TIME=84504886
>>>
>>> scan:
>>> 12/02/01 13:42:52 INFO mapred.JobClient:     ROWS=52428500
>>> 12/02/01 13:42:52 INFO mapred.JobClient:     ELAPSED_TIME=8158664
>>>
>>> Would I be correct in thinking that this is way below what is to be
>>> expected of this hardware?
>>> We're setting up ganglia now to start debugging, but any suggestions
>>> on how to diagnose this would be greatly appreciated.
>>>
>>> Thanks!
>>> Tim
>
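
One way to read the counters in the original message is as per-task rates. The sketch below is only a rough reading and rests on two assumptions that the thread does not confirm: that ELAPSED_TIME is the sum, in milliseconds, of the time spent by every map task, and that the job really ran 500 map tasks as Tim observed. The helper name per_task_stats is made up for illustration.

```python
# Counter values copied from the job output quoted in this thread.
ROWS = 52_428_500
ELAPSED_MS = {
    "randomWrite": 84_504_886,
    "scan": 8_158_664,
}

# Assumption: 500 map tasks, as observed in the original message.
MAP_TASKS = 500


def per_task_stats(rows, total_elapsed_ms, tasks):
    """Return (rows per task, seconds per task, rows/s per task).

    Assumes total_elapsed_ms is the sum of per-task elapsed times in ms.
    """
    rows_per_task = rows / tasks
    secs_per_task = total_elapsed_ms / tasks / 1000.0
    return rows_per_task, secs_per_task, rows_per_task / secs_per_task


results = {
    name: per_task_stats(ROWS, ms, MAP_TASKS)
    for name, ms in ELAPSED_MS.items()
}

for name, (rows_per_task, secs, rate) in results.items():
    print(f"{name}: {rows_per_task:.0f} rows/task, "
          f"{secs:.1f} s/task, {rate:.0f} rows/s per task")
```

Under those assumptions, randomWrite works out to roughly 620 rows per second per map task and scan to roughly 6,400. These are per-task figures, not cluster throughput: the aggregate rate depends on how many tasks ran concurrently, and working it out needs the wall-clock duration of each job, which the thread does not give.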