Re: Help with adjusting Hadoop configuration files
Yeah, it will increase performance by reducing the number of mappers and
letting each mapper use more memory. The right value depends on the
application and the RAM available. For your use case I think 512 MB to 1 GB
would be a better value.
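
For reference, a minimal hdfs-site.xml fragment along those lines might look
like the sketch below (dfs.block.size is the 0.20-era property name and takes
a value in bytes; 512 MB is just one point in the suggested range, and the
<property> blocks go inside the <configuration> element):

  <!-- hdfs-site.xml: default block size for newly written files -->
  <property>
    <name>dfs.block.size</name>
    <value>536870912</value> <!-- 512 MB, expressed in bytes -->
  </property>

Only files written after the change pick up the new size; existing files keep
the block size they were written with.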

On Tue, Jun 21, 2011 at 4:28 PM, Avi Vaknin <[EMAIL PROTECTED]> wrote:

> Hi,
> The block size is configured to 128 MB; I've read that it is recommended
> to increase it in order to get better performance.
> What value do you recommend setting it to?
>
> Avi
>
> -----Original Message-----
> From: madhu phatak [mailto:[EMAIL PROTECTED]]
> Sent: Tuesday, June 21, 2011 12:54 PM
> To: [EMAIL PROTECTED]
> Subject: Re: Help with adjusting Hadoop configuration files
>
> If you reduce the default DFS block size (which is set in the configuration
> file) and you use the default InputFormat, more mappers are created at a
> time, which may help you use the RAM more effectively. Another way is to
> create as many parallel jobs as possible programmatically so that all the
> available RAM is used.
>
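
As a rough illustration of that relationship (the default FileInputFormat
creates roughly one map task per HDFS block; the 10 GB input file here is
hypothetical):

  10 GB input / 128 MB blocks  ->  ~80 map tasks
  10 GB input /  64 MB blocks  ->  ~160 map tasks

Halving the block size roughly doubles the number of map tasks available to
run in parallel, at the cost of more per-task startup overhead.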
> On Tue, Jun 21, 2011 at 3:17 PM, Avi Vaknin <[EMAIL PROTECTED]> wrote:
>
> > Hi Madhu,
> > First of all, thanks for the quick reply.
> > I've searched the net for the properties of the configuration files, and
> > I specifically wanted to know if there is a property related to memory
> > tuning (as you can see I have 7.5 GB of RAM on each datanode and I really
> > want to use it properly).
> > Also, I've changed mapred.tasktracker.map.tasks.maximum and
> > mapred.tasktracker.reduce.tasks.maximum to 10 (the number of cores on the
> > datanodes) and unfortunately I haven't seen any change in the performance
> > or running time of the jobs.
> >
> > Avi
> >
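
For what it's worth, on 7.5 GB datanodes the memory-related settings that
usually matter are the per-node slot counts and the per-task child JVM heap
in mapred-site.xml. A sketch along those lines (the slot counts and heap size
below are illustrative starting points, not tuned values):

  <!-- mapred-site.xml: task slots per TaskTracker and per-task JVM heap -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>6</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx768m</value> <!-- heap for each map/reduce task JVM -->
  </property>

The rough budget is slots times child heap, plus the DataNode and TaskTracker
daemons, staying under physical RAM: 8 slots at 768 MB is about 6 GB on a
7.5 GB machine. The tasktracker slot settings are read at daemon startup, so
the TaskTrackers need a restart after changing them, and extra slots only
help if the job actually has enough map and reduce tasks to fill them.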
> > -----Original Message-----
> > From: madhu phatak [mailto:[EMAIL PROTECTED]]
> > Sent: Tuesday, June 21, 2011 12:33 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Help with adjusting Hadoop configuration files
> >
> > The utilization of the cluster depends on the number of jobs and the
> > number of mappers and reducers. The configuration files only help you set
> > up the cluster by specifying that information. You can also specify
> > details such as the block size and replication factor in the
> > configuration files, which may help you with job management. You can read
> > about all the available configuration properties here:
> > http://hadoop.apache.org/common/docs/current/cluster_setup.html
> >
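
As a concrete example of the replication setting mentioned above, an
hdfs-site.xml entry for a cluster with only two datanodes (a value higher
than the number of datanodes cannot be satisfied):

  <!-- hdfs-site.xml: replication factor for newly written files -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>

Existing files keep their current replication unless it is changed
explicitly (e.g. with hadoop fs -setrep).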
> > On Tue, Jun 21, 2011 at 2:13 PM, Avi Vaknin <[EMAIL PROTECTED]> wrote:
> >
> > > Hi Everyone,
> > > We are a start-up company that has been using the Hadoop cluster
> > > platform (version 0.20.2) in the Amazon EC2 environment.
> > > We tried to set up a cluster in two different forms:
> > > Cluster 1: 1 master (namenode) + 5 datanodes - all of the machines are
> > > small EC2 instances (1.6 GB RAM)
> > > Cluster 2: 1 master (namenode) + 2 datanodes - the master is a small
> > > EC2 instance and the two datanodes are large EC2 instances (7.5 GB RAM)
> > > We made changes to the configuration files (core-site, hdfs-site and
> > > mapred-site xml files) and expected to see a significant improvement in
> > > the performance of cluster 2; unfortunately this has yet to happen.
> > >
> > > Are there any special parameters in the configuration files that we
> > > need to change in order to adjust Hadoop to a larger hardware
> > > environment?
> > > Are there any best practices you recommend?
> > >
> > > Thanks in advance.
> > >
> > > Avi
> > >
> > >
> > >
> > >
> >
>