MapReduce >> mail # dev >> Problem while submitting jobs to NM started with ephemeral ports.

Re: Problem while submitting jobs to NM started with ephemeral ports.
I also tried commenting out the last two properties in yarn-site from my
earlier message (quoted below), while keeping the following property in
mapred-site:

    <property>
      <name>mapreduce.shuffle.port</name>
      <value>0</value>
    </property>

With that setup, I got this exception while running a wordcount job:

mapreduce.Job (Job.java:printTaskEvents(1315)) - Task Id : attempt_1318840789401_0005_r_000000_0, Status : FAILED
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#5
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:126)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:365)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:147)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:142)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253)
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:227)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)

Otherwise, everything works out of the box.

Thanks,
Prashant.

On Mon, Oct 17, 2011 at 2:03 PM, Prashant Sharma
<[EMAIL PROTECTED]> wrote:
> I am using the following properties in yarn-site:
>
>  <property>
>    <name>yarn.nodemanager.aux-services</name>
>    <value>mapreduce.shuffle</value>
>  </property>
>  <property>
>    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
>    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>  </property>
>  <property>
>    <name>yarn.nodemanager.address</name>
>    <value>localhost:0</value>
>  </property>
>  <property>
>    <name>yarn.nodemanager.localizer.address</name>
>    <value>localhost:0</value>
>  </property>
>
> Everything starts fine (all daemons come up without errors). But
> when I try to submit a job, the job gets stuck and the NM logs say it
> is trying to connect to 'localhost:0'. Localization takes forever. Why?
>
> Please see the NM logs below.
>
> http://pastebin.com/QfQDZeqF
>
> Thanks,
> Prashant
>
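
For anyone reading this thread later, here is a minimal, Hadoop-independent sketch of what "port 0" means at the socket level. It only shows general JDK behaviour: a server that binds to port 0 gets an ephemeral port from the OS, and that real port has to be read back from the socket, since nothing can connect to port 0 itself. Whether the NM's attempt to reach 'localhost:0' comes from it using the configured address instead of the bound one is an assumption on my part; the thread does not confirm it. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical demo class, not part of Hadoop.
public class EphemeralPortDemo {

    /** Binds to port 0 on loopback and returns the port the OS actually assigned. */
    static int bindAndGetActualPort() throws IOException {
        try (ServerSocket server = new ServerSocket()) {
            // Port 0 asks the OS to pick any free (ephemeral) port.
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            // The configured value was 0; the usable value is only known after bind.
            return server.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        int configuredPort = 0; // what a config value like "localhost:0" carries
        int actualPort = bindAndGetActualPort();
        System.out.println("configured address: localhost:" + configuredPort);
        System.out.println("bound address:      localhost:" + actualPort);
    }
}
```

The printed bound port will differ from run to run. A client can only reach the service if it is told the bound address, not the configured one.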