-
Questions on hadoop configuration for heterogeneous cluster
Yu Li 2010-12-16, 09:18
Hi all,
I'm a little confused about how to configure Hadoop in a heterogeneous cluster. For example, if I have one machine (m1) with a two-core processor and another (m2) with a four-core processor, and I'd like to use them as TaskTracker nodes in a Hadoop cluster, how should I configure mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum? Could I set both parameters to 2 on m1 and to 4 on m2? Or do I have to set both to 2 on the JT node? In other words, among the many parameters in *-site.xml and the environment variables in hadoop-env.sh, which ones can be set to different values on each DN/TT and still take effect?
Thanks in advance and look forward to your reply.
-- Best Regards, Li Yu
-
Re: Questions on hadoop configuration for heterogeneous cluster
Harsh J 2010-12-16, 10:08
Hi,
On Thu, Dec 16, 2010 at 2:48 PM, Yu Li <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> I'm a little confused about how to configure hadoop in a heterogeneous
> cluster. For example, if I have one machine(m1) with a two-core
> processor, another(m2) with a four-core processor, and I'd like to use them
> as tasktracker nodes in a hadoop cluster, how could I configure the
> mapred.tasktracker.map/reduce.tasks.maximum? Could I set both parameters to
> 2 on m1, and set to 4 on m2?
Yes, you can do this just fine. Note that the property name says "tasktracker", meaning it is a TaskTracker-specific setting and can vary from node to node. It has nothing to do with the JobTracker.
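As a rough illustration of the per-node setup described above, here is a minimal sketch of what mapred-site.xml might look like on m1. The values 2 and 4 come straight from the question; the file location (conf/mapred-site.xml) is an assumption about a typical install, and whether one slot per core is the right ratio for a given workload is not something this thread settles.

```xml
<?xml version="1.0"?>
<!-- conf/mapred-site.xml on m1 (two cores); file path is an assumption. -->
<!-- On m2 (four cores), the same file would use 4 for both values. -->
<configuration>
  <property>
    <!-- Maximum number of map tasks this TaskTracker runs at once -->
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <!-- Maximum number of reduce tasks this TaskTracker runs at once -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
```

Each TaskTracker reads its own local copy of this file when it starts, so the change would presumably need a restart of the TaskTracker on that node to take effect.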
> In other words, among the many parameters in *-site.xml and the
> environment variables in hadoop-env.sh, which ones could be set on each
> DN/TT with different values and still take effect?
Properties matching *.tasktracker.* and *.datanode.* are TT- and DN-specific and can be set individually on each node. This follows from a naming convention used by Hadoop.
-- Harsh J www.harshj.com
-
Re: Questions on hadoop configuration for heterogeneous cluster
Yu Li 2010-12-17, 01:46
Hi Harsh,
Thanks a lot for your reply, this really helps!
-- Best Regards, Li Yu