Re: Cluster Tuning
Have you tried using a Combiner?

Here's an example of using one:
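
As a rough sketch of the idea (plain Java with no Hadoop dependency, so it
runs on its own; the class and record names are hypothetical and the sample
data is made up): a combiner applies the summation to each map task's output
before the shuffle, so a task sends at most one record per host to the
reducers instead of one per log line. In a real job the equivalent step is
registering a summing reducer with Job.setCombinerClass(...), which is safe
here because addition is associative and commutative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of a combiner; nothing here is Hadoop API.
class CombinerSketch {

    // Hypothetical map output record: one (host, bytes) pair per log line.
    static final class HostBytes {
        final String host;
        final long bytes;

        HostBytes(String host, long bytes) {
            this.host = host;
            this.bytes = bytes;
        }
    }

    // Reduce step: total bytes per host over whatever records arrive.
    static Map<String, Long> reduce(List<HostBytes> records) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (HostBytes r : records) {
            totals.merge(r.host, r.bytes, Long::sum);
        }
        return totals;
    }

    // Combine step: the same summation, run on a single map task's output,
    // re-emitted as records so the reducer can consume them unchanged.
    static List<HostBytes> combine(List<HostBytes> mapOutput) {
        List<HostBytes> combined = new ArrayList<>();
        for (Map.Entry<String, Long> e : reduce(mapOutput).entrySet()) {
            combined.add(new HostBytes(e.getKey(), e.getValue()));
        }
        return combined;
    }

    public static void main(String[] args) {
        // Made-up output of one map task: three log lines, two hosts.
        List<HostBytes> mapOutput = Arrays.asList(
                new HostBytes("a.example", 100L),
                new HostBytes("b.example", 50L),
                new HostBytes("a.example", 25L));

        List<HostBytes> combined = combine(mapOutput);
        System.out.println("records before combine: " + mapOutput.size());
        System.out.println("records after combine:  " + combined.size());
        System.out.println("totals: " + reduce(combined));
    }
}
```

With roughly 65000 distinct hosts in a 2GB log, most lines repeat a host the
same map task has already seen, so the data written to disk and shuffled to
the reducers should shrink considerably. Hadoop may run a combiner zero or
more times, so the job has to produce the same totals with or without it.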



On Thu, Jul 7, 2011 at 4:29 PM, Juan P. <[EMAIL PROTECTED]> wrote:
> Hi guys!
> I'd like some help fine-tuning my cluster. I currently have 20 identical
> boxes: single-core machines with 600MB of RAM. No chance of upgrading the
> hardware.
> My cluster is made out of 1 NameNode/JobTracker box and 19
> DataNode/TaskTracker boxes.
> All my config is default, except that I've set the following in my
> mapred-site.xml to try to prevent choking my boxes.
>   <property>
>       <name>mapred.tasktracker.map.tasks.maximum</name>
>       <value>1</value>
>   </property>
> I'm running a MapReduce job which reads a Proxy Server log file (2GB), maps
> hosts to each record and then in the reduce task it accumulates the number
> of bytes received from each host.
> Currently it's producing about 65000 keys.
> The whole job takes forever to complete, especially the reduce part. I've
> tried different tuning configs but I can't bring it down under 20 mins.
> Any ideas?
> Thanks for your help!
> Pony

Joseph Echeverria
Cloudera, Inc.