MapReduce >> mail # user >> Profiling Hadoop Map Reduce with the 20.2 API


David Jurgens 2010-08-12, 22:33
Hemanth Yamijala 2010-08-13, 05:16
David Jurgens 2010-08-16, 20:34
Re: Profiling Hadoop Map Reduce with the 20.2 API
David,

>   It looks like calling configuration.setBoolean("mapred.task.profile",
> true) will enable profiling with the 20.2 APIs.  I am able to see the
> profiling output when I check the web interface.  Thanks for your help!  Is
> there a good place to document this setting so others can find this
> information?

The Forrest documentation for MapReduce
(http://hadoop.apache.org/common/docs/r0.20.0/mapred_tutorial.html)
should be updated. The Profiling section does mention these configuration
parameters, but it doesn't explicitly describe setting them through the
Configuration API. That, in turn, is because JobConf is still a preferred
way of setting parameters in the Hadoop 0.20 major release. Later versions
of the documentation will hopefully correct this.

Thanks
hemanth

>
> On Thu, Aug 12, 2010 at 10:16 PM, Hemanth Yamijala <[EMAIL PROTECTED]>
> wrote:
>>
>> Hi,
>>
>> >   I recently started developing with Hadoop using the 20.2 API.  I'm
>> > looking to profile one of my jobs but I haven't been able to find any
>> > documentation about how to do this.  For the earlier (deprecated) API,
>> > there's some documentation on how to profile with the JobConf class
>> > (http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#Profiling).
>> > Is there anything equivalent in 20.2, or is there another process to use
>> > for it?  Search engines and javadoc searching haven't turned up anything
>> > so far for me.
>>
>> All APIs in the deprecated JobConf class are wrappers around the
>> Configuration API that you are probably using with the new 20.2 API.
>> For example:
>>
>> setProfileEnabled   = configuration.setBoolean("mapred.task.profile", value)
>> setProfileTaskRange = configuration.set("mapred.task.profile.maps", value)
>>                       or configuration.set("mapred.task.profile.reduces", value)
>> setProfileParams    = configuration.set("mapred.task.profile.params", value)
>>
>> Could you try setting these parameters in the Configuration you might
>> be using for the Job and see if that works?
>>
>> Thanks
>> Hemanth
>
>
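
The property names exchanged in this thread can be collected in one place. The following is a minimal sketch, not code from either poster: the ProfilingSettings class and its profilingProperties helper are hypothetical names, and the task ranges and hprof parameter string are example values assumed for illustration. The sketch deliberately avoids a Hadoop dependency so it runs on its own; with Hadoop on the classpath, each entry would instead be applied with configuration.set(key, value) on the Configuration used for the job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: gathers the profiling
// properties named in the thread so they can be copied into the
// Configuration used for a job.
class ProfilingSettings {

    static Map<String, String> profilingProperties(
            boolean enabled, String mapRange, String reduceRange, String params) {
        Map<String, String> props = new LinkedHashMap<>();
        // Same key the thread sets via configuration.setBoolean(...);
        // the boolean is stored here as the string "true" or "false".
        props.put("mapred.task.profile", Boolean.toString(enabled));
        // Which map and reduce task IDs to profile.
        props.put("mapred.task.profile.maps", mapRange);
        props.put("mapred.task.profile.reduces", reduceRange);
        // JVM profiler arguments passed to the profiled tasks.
        props.put("mapred.task.profile.params", params);
        return props;
    }

    public static void main(String[] args) {
        // Example values only (assumptions, not taken from the thread).
        Map<String, String> props = profilingProperties(
                true, "0-2", "0-2",
                "-agentlib:hprof=cpu=samples,heap=sites,force=n,thread=y,verbose=n,file=%s");

        // With Hadoop available, replace the println with
        // configuration.set(e.getKey(), e.getValue()).
        for (Map.Entry<String, String> e : props.entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Keeping the keys in a plain map makes it easy to check exactly which settings a job will receive before they are handed to the Configuration object.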