Re: MapReduce tuning
Mohit Anchlia 2012-02-25, 21:20
On Sat, Feb 25, 2012 at 7:11 AM, Jie Li <[EMAIL PROTECTED]> wrote:
> Hello Mohit,
>> I am looking at some Hadoop tuning parameters like io.sort.mb,
>> mapred.child.java.opts, etc.
>> - My question was where to look for the current settings.
> The default settings, as well as the documentation, can be found in Hadoop
>> - Are these settings configured cluster wide or per job?
> Some settings are configured cluster wide, e.g. the number of map/reduce
> slots per node, while some settings are configured per job, e.g.
> io.sort.mb. It depends on the functionality of that specific parameter.
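To make the per-job versus cluster-wide distinction concrete, here is a minimal Python sketch of how layered Hadoop-style configuration files could be resolved. It is an illustration, not Hadoop's actual implementation: the function names (parse_properties, effective_config) and the sample values are hypothetical. It assumes the usual <configuration>/<property> XML layout and the convention that a property marked <final>true</final> cannot be overridden by a later layer.

```python
import xml.etree.ElementTree as ET

# Illustrative sample layers (values are assumptions, not taken from the thread).
DEFAULTS = """<configuration>
  <property><name>io.sort.mb</name><value>100</value></property>
  <property><name>mapred.tasktracker.map.tasks.maximum</name><value>2</value></property>
</configuration>"""

SITE = """<configuration>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
    <final>true</final>
  </property>
</configuration>"""

JOB = """<configuration>
  <property><name>io.sort.mb</name><value>200</value></property>
  <property><name>mapred.tasktracker.map.tasks.maximum</name><value>8</value></property>
</configuration>"""


def parse_properties(xml_text):
    """Parse Hadoop-style configuration XML into {name: (value, is_final)}."""
    props = {}
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is None:
            continue  # skip malformed entries without a name
        value = (prop.findtext("value") or "").strip()
        is_final = (prop.findtext("final") or "").strip().lower() == "true"
        props[name.strip()] = (value, is_final)
    return props


def effective_config(*layers):
    """Merge layers in order; later layers win unless the property is already final."""
    merged = {}
    for layer in layers:
        for name, (value, is_final) in parse_properties(layer).items():
            if name in merged and merged[name][1]:
                continue  # an earlier layer marked this property final
            merged[name] = (value, is_final)
    return {name: value for name, (value, _) in merged.items()}


config = effective_config(DEFAULTS, SITE, JOB)
for key in sorted(config):
    print(key, "=", config[key])
```

In this sketch the job's io.sort.mb override takes effect, while its attempt to change the slot count is ignored because the site layer marked that property final.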
For cluster-wide settings, I am assuming a cluster restart is needed?
>> - What's the best way to look at reasons of slow performance?
> Well, I want to introduce Starfish to you. Starfish is a self-tuning
> system built on Hadoop to provide good performance automatically, without
> any need for users to understand and manipulate the many tuning knobs in
> Hadoop.
> With Starfish, you can analyze the performance of your Hadoop job at a
> fine-grained level, e.g. the time spent on map processing, spilling,
> merging, shuffling, sorting, and reduce processing. So you can see which
> phase is the performance bottleneck.
> You can also ask "what-if" questions, e.g. "What if I double io.sort.mb
> ?", and Starfish will predict the new behaviour of the job, so you can
> better understand how these parameters work. In addition, you can simply
> let Starfish find the optimal configuration for you to achieve the
> best performance.
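As a rough illustration of the kind of "what-if" question mentioned above, the following Python sketch estimates how many spill files a single map task writes for a given io.sort.mb. This is not Starfish's model: it is a deliberately simplified back-of-the-envelope estimate that assumes a spill is triggered whenever the sort buffer reaches the io.sort.spill.percent threshold (0.80 assumed here) and ignores record metadata overhead and combiners. The function name estimate_spills and the 400 MB map output are hypothetical.

```python
import math


def estimate_spills(map_output_mb, io_sort_mb, spill_percent=0.80):
    """Rough number of spill files for one map task under the simplified model."""
    # Portion of the sort buffer that fills up before a spill is triggered.
    usable_mb = io_sort_mb * spill_percent
    # A map task that produces output writes at least one spill.
    return max(1, math.ceil(map_output_mb / usable_mb))


# What if io.sort.mb is doubled for a map task emitting about 400 MB?
baseline = estimate_spills(map_output_mb=400, io_sort_mb=100)
doubled = estimate_spills(map_output_mb=400, io_sort_mb=200)
print("spills with io.sort.mb=100:", baseline)
print("spills with io.sort.mb=200:", doubled)
```

Fewer spills generally mean less merge work on the map side, but a real prediction also depends on available heap, record sizes, and the rest of the job, which this sketch does not capture.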
> You are welcome to join our Google Group to discuss Starfish, and any
> feedback will be appreciated. If you run into any problems, please don't
> hesitate to let us know. The Group address is listed below.
> Starfish Group, Duke University
> Starfish Homepage: www.cs.duke.edu/starfish/
> Starfish Google Group: http://groups.google.com/group/hadoop-starfish