Hadoop in a Heterogeneous Environment - taking advantage of different processor specs
Saptarshi Guha 2009-07-28, 14:40
Not sure if this has been asked or answered.
Suppose I have tasktrackers A1, A2, A3, each with 4 cores and 16GB RAM, configured with:
mapred.tasktracker.map.tasks.maximum = 6
mapred.tasktracker.reduce.tasks.maximum = 4
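In hadoop-site.xml terms, I mean something roughly like the following. This is only a sketch of the usual property layout; the two values are the ones quoted above, and nothing else in my actual file is shown.

```xml
<?xml version="1.0"?>
<!-- hadoop-site.xml on A1, A2 and A3 (sketch; only these two values are from my setup) -->
<configuration>
  <property>
    <!-- maximum number of map tasks this tasktracker runs at once -->
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>6</value>
  </property>
  <property>
    <!-- maximum number of reduce tasks this tasktracker runs at once -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```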
Now suppose I have one more machine (X) with 8 cores and 32GB RAM.
Since (if I'm not mistaken) tasktrackers talk to the jobtracker, can I take
advantage of X as follows:
a) A1, A2, A3 each have the same hadoop-site.xml with the above values for
the two task maximums.
b.0) On X, I have a hadoop-site.xml with these values
mapred.tasktracker.map.tasks.maximum = 5
mapred.tasktracker.reduce.tasks.maximum = 3
and start *one* tasktracker.
b.1) Then edit hadoop-site.xml, change the tasktracker port (if there is
such a thing) and
b.2) start *another* tasktracker.
We can skip b.1) if there is no such thing as a tasktracker port.
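To make b.1) concrete: the closest thing to a "tasktracker port" that I can find in mapred-default.xml is mapred.task.tracker.http.address. Below is a sketch of what the second tasktracker's hadoop-site.xml on X might contain, assuming that is the right property to change. The port 50061 is a made-up value, and I have not tested whether this is enough for the jobtracker to accept two tasktrackers from one host.

```xml
<?xml version="1.0"?>
<!-- hadoop-site.xml for the *second* tasktracker on X (untested sketch) -->
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>5</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>3</value>
  </property>
  <property>
    <!-- Assumption: this is the "tasktracker port". 50061 is a hypothetical
         free port, chosen only to differ from the first tasktracker's. -->
    <name>mapred.task.tracker.http.address</name>
    <value>0.0.0.0:50061</value>
  </property>
</configuration>
```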
Hence I will have /two/ tasktrackers running on X and one on each Ai, and
thus take advantage of X.
Is this at all possible? Or am I talking nonsense?
Any pointers appreciated.