I'm a little confused about how to configure Hadoop in a heterogeneous
cluster. For example, if I have one machine (m1) with a two-core
processor and another (m2) with a four-core processor, and I'd like to use
both as TaskTracker nodes in a Hadoop cluster, how should I configure
mapred.tasktracker.map.tasks.maximum and
mapred.tasktracker.reduce.tasks.maximum? Could I set both parameters to
2 on m1 and to 4 on m2? Or do I have to set both to 2 on the JT node? In
other words, among the many parameters in *-site.xml and the
environment variables in hadoop-env.sh, which ones can be set to different
values on each DN/TT and still take effect?
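
To make the question concrete, here is a sketch of what I would put in
mapred-site.xml on m1 only (the file name and the value 2 are my
assumptions for this example; m2 would carry the same two properties with
the value 4). Whether the TaskTracker on each node actually uses its own
local copy is the part I am unsure about.

```xml
<?xml version="1.0"?>
<!-- Hypothetical mapred-site.xml on m1 (two cores); m2 would use 4. -->
<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
```
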
Thanks in advance; I look forward to your reply.
Harsh J 2010-12-16, 10:08
Yu Li 2010-12-17, 01:46