Hadoop, mail # user - Hadoop cluster optimization


Re: Hadoop cluster optimization
Allen Wittenauer 2011-08-22, 04:05

On Aug 21, 2011, at 7:17 PM, Michel Segel wrote:

> Avi,
> First, why a 32-bit OS?
> You have a 64-bit processor with 4 hyper-threaded cores, which looks like 8 CPUs.

With only 1.7 GB of memory, there likely isn't much reason to use a 64-bit OS.  The machines (as you point out) are already tight on memory, and 64-bit is only going to make that worse.
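
To make the memory squeeze concrete, here is a rough sketch of a worst-case heap budget for one worker node. The numbers are assumptions, not taken from the poster's cluster: daemon heaps of 1000 MB each (the stock HADOOP_HEAPSIZE default, as far as I recall), two map slots and two reduce slots, and 200 MB per child JVM (the usual mapred.child.java.opts default of -Xmx200m). The function name is hypothetical. It only sums maximum heap sizes, so it ignores JVM overhead, the OS, and the page cache.

```python
# Rough worst-case JVM heap budget for one worker node (illustrative only).

def worst_case_heap_mb(daemon_heaps_mb, map_slots, reduce_slots, child_heap_mb):
    """Sum the maximum heap that every JVM on the node could claim at once."""
    task_heap_mb = (map_slots + reduce_slots) * child_heap_mb
    return sum(daemon_heaps_mb) + task_heap_mb

# Assumed values, not measured on the cluster in this thread.
NODE_RAM_MB = 1700               # "1.7 GB" per machine, rounded
DAEMON_HEAPS_MB = [1000, 1000]   # DataNode + TaskTracker max heap
MAP_SLOTS = 2
REDUCE_SLOTS = 2
CHILD_HEAP_MB = 200

budget = worst_case_heap_mb(DAEMON_HEAPS_MB, MAP_SLOTS, REDUCE_SLOTS, CHILD_HEAP_MB)
print(f"worst-case heap: {budget} MB, physical RAM: {NODE_RAM_MB} MB")
```

Under those assumptions the heaps alone could ask for more than the machine has, before counting any per-process overhead. Lowering the daemon heap or the slot counts is the obvious lever on a node this small.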

>>
>> 1.7 GB memory
>> 1 Intel(R) Xeon(R) CPU E5507 @ 2.27GHz
>> Ubuntu Server 10.10 , 32-bit platform
>> Cloudera CDH3 Manual Hadoop Installation
>> (for those familiar with Amazon Web Services, I am talking about
>> Small EC2 instances)
>>
>> Total job run time is about 15 minutes (about 50 files/blocks/map tasks of
>> up to 250 MB each, and 10 reduce tasks).
>>
>> Based on the above information, can anyone recommend a best-practice
>> configuration?

How many spindles?  Are your tasks spilling?
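
One way to answer the spilling question is to compare two counters from a finished map task: "Map output records" and "Spilled Records". The sketch below assumes you copy those two values by hand from the task's counter page; the function names and the sample numbers are made up for illustration. The job-level "Spilled Records" total also includes reduce-side spills, and a combiner changes the counts, so treat the ratio as a hint rather than a measurement.

```python
# Rough check for extra map-side spilling, using counters read off a task page.

def spill_ratio(spilled_records, map_output_records):
    """Return spilled records per map output record for one map task."""
    if map_output_records <= 0:
        raise ValueError("map_output_records must be positive")
    return spilled_records / map_output_records

def looks_like_extra_spilling(spilled_records, map_output_records):
    """A ratio above 1.0 suggests records went to disk more than once."""
    return spill_ratio(spilled_records, map_output_records) > 1.0

# Made-up sample values, not from the job discussed in this thread.
print(spill_ratio(3_000_000, 1_000_000))
print(looks_like_extra_spilling(3_000_000, 1_000_000))
```

If the ratio is well above 1, the sort buffer (io.sort.mb) is probably too small for the map output, though on a 1.7 GB machine there is not much room to raise it.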

>> Do you think that, with such a small cluster and such a small amount of
>> data, it is even possible to optimize jobs so they run much faster?

Most of the time, performance issues are with the algorithm, not Hadoop.