-
EC2 cloudera cc1.4xlarge
Aleksandr Elbakyan 2011-05-24, 23:23
Hello,
I want to use a cc1.4xlarge cluster for some data processing, and I am using the Cloudera scripts to spin up clusters. hadoop-ec2-init-remote.sh has default configurations up to c1.xlarge but none for cc1.4xlarge. Can someone give the formula for how these values are calculated based on the hardware?
C1.XLARGE
MAX_MAP_TASKS=8 - mapred.tasktracker.map.tasks.maximum
MAX_REDUCE_TASKS=4 - mapred.tasktracker.reduce.tasks.maximum
CHILD_OPTS=-Xmx680m - mapred.child.java.opts
CHILD_ULIMIT=1392640 - mapred.child.ulimit
I am guessing, but I think:
CHILD_OPTS = (total RAM on the box - 1 GB) / (MAX_MAP_TASKS, MAX_REDUCE_TASKS)
But I am not sure how to calculate the rest.
Regards, Aleksandr
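The guess above can be sketched in a few lines of Python. This is only an illustration, not how hadoop-ec2-init-remote.sh actually derives its defaults: it assumes the divisor is the sum of map and reduce slots (the post leaves that open), and the function names are made up for the example. The ulimit helper rests on one observation from the c1.xlarge values: mapred.child.ulimit is given in KB, and 1392640 KB is exactly twice the 680 MB heap, which suggests (but does not confirm) a "2 x heap" rule.

```python
def heap_per_task_mb(total_ram_mb, reserved_mb, map_slots, reduce_slots):
    """Heap in MB per child JVM if every map and reduce slot is busy at once.

    Assumption: the divisor is map_slots + reduce_slots; the original
    post does not say how the two slot counts are combined.
    """
    slots = map_slots + reduce_slots
    if slots <= 0:
        raise ValueError("need at least one task slot")
    return (total_ram_mb - reserved_mb) // slots


def ulimit_kb(heap_mb, factor=2):
    """Virtual memory limit in KB as a multiple of the heap size.

    Assumption: factor=2 is inferred from 1392640 KB == 2 * 680 MB.
    """
    return heap_mb * factor * 1024


# cc1.4xlarge: 23 GB of RAM, 1 GB reserved, 16 map and 8 reduce slots
print(heap_per_task_mb(23 * 1024, 1024, 16, 8))
# c1.xlarge heap of 680 MB reproduces the shipped CHILD_ULIMIT
print(ulimit_kb(680))
```

Under these assumptions, a 1024 MB heap would pair with a CHILD_ULIMIT of 2097152 rather than 1392640.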
-
Re: EC2 cloudera cc1.4xlarge
Aleksandr Elbakyan 2011-05-24, 23:57
I looked into different clusters and configurations from Cloudera and came up with these numbers. Let me know what you think.
Machine
23 GB of memory
33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
1690 GB of instance storage
64-bit platform
I/O Performance: Very High (10 Gigabit Ethernet)
API name: cc1.4xlarge
MAX_MAP_TASKS=16 - mapred.tasktracker.map.tasks.maximum
MAX_REDUCE_TASKS=8 - mapred.tasktracker.reduce.tasks.maximum
CHILD_OPTS=-Xmx1024m - mapred.child.java.opts
CHILD_ULIMIT=1392640 - mapred.child.ulimit
Regards, Aleksandr
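A quick sanity check on the proposed values, as a rough sketch: it only adds up the maximum Java heap of all child tasks in the worst case where every slot is in use, and ignores JVM overhead and the memory needed by the DataNode and TaskTracker daemons. The helper name is hypothetical.

```python
def total_child_heap_mb(map_slots, reduce_slots, heap_mb):
    """Combined max heap in MB if all map and reduce slots run at once."""
    return (map_slots + reduce_slots) * heap_mb


ram_mb = 23 * 1024                            # cc1.4xlarge: 23 GB
needed = total_child_heap_mb(16, 8, 1024)     # proposed settings

# Compare worst-case heap demand against physical memory
print(needed, ram_mb, needed > ram_mb)
```

With 16 map slots, 8 reduce slots and -Xmx1024m, the worst-case heap demand (24576 MB) is already above the 23552 MB on the box, so either fewer slots or a smaller heap may be worth considering. The unchanged CHILD_ULIMIT of 1392640 KB (1360 MB) also leaves much less headroom above a 1024 MB heap than it did above 680 MB.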
-
Re: EC2 cloudera cc1.4xlarge
Konstantin Boudnik 2011-05-25, 01:15
Try the Cloudera-specific lists with your questions.
--
Take care, Konstantin (Cos) Boudnik
2CAC 8312 4870 D885 8616 6115 220F 6980 1F27 E622
Disclaimer: Opinions expressed in this email are those of the author, and do not necessarily represent the views of any company the author might be affiliated with at the moment of writing.