MapReduce, mail # user - Hadoop Streaming job error - Need help urgent


prithvi dammalapati 2013-04-22, 17:11
prithvi dammalapati 2013-04-22, 19:04
Re: Hadoop Streaming job error - Need help urgent
Chris Nauroth 2013-04-22, 19:17
OK, great.  It looks like you've made progress with the change to "$input_path".

Now it's actually submitting the job, but something is causing the map
tasks to fail.  Usually, this is some kind of bug in user code, so you'll
need to do some further investigation on your side.  I expect the tracking
URL mentioned in the output above will give you some clues.  That should
also steer you towards the individual task log outputs.
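
For reference, the per-attempt stderr/syslog of a failed streaming task is where the real exception usually shows up.  On a single-node Hadoop 1.x install those logs are also written to local disk under the TaskTracker's userlogs directory, so a quick look along these lines (a sketch assuming the default layout under your hadoop_home; the exact directory nesting differs slightly between 1.x releases) should surface the failure for the reported attempt:

  # LastFailedTask was task_201304221215_0002_m_000006; locate its attempt directories
  find /usr/local/hadoop/hadoop-1.0.4/logs/userlogs -name 'attempt_201304221215_0002_m_000006_*'
  # then inspect the logs inside the attempt directory it prints
  cat <attempt_dir>/stderr    # output written by your mapper/reducer process
  cat <attempt_dir>/syslog    # the task's Hadoop-side log

The same logs are reachable by clicking through the failed map task on the tracking URL.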

--Chris

On Mon, Apr 22, 2013 at 12:04 PM, prithvi dammalapati <
[EMAIL PROTECTED]> wrote:

> java_home=/usr/lib/jvm/java-1.7.0-openjdk-amd64
> hadoop_home=/usr/local/hadoop/hadoop-1.0.4
> hadoop_lib=$hadoop_home/hadoop-core-1.0.4.jar
> hadoop_bin=$hadoop_home/bin/hadoop
> hadoop_config=$hadoop_home/conf
> hadoop_streaming=$hadoop_home/contrib/streaming/hadoop-streaming-1.0.4.jar
> #task specific parameters
> source_code=BetweennessCentrality.java
> jar_file=BetweennessCentrality.jar
> main_class=mslab.BetweennessCentrality
> num_of_node=38012
> num_of_mapper=100
> num_of_reducer=8
> input_path=/data/dblp_author_conf_adj.txt
> output_path=dblp_bc_N$(($num_of_node))_M$((num_of_mapper))
> rm build -rf
> mkdir build
> $java_home/bin/javac -d build -classpath .:$hadoop_lib src/mslab/$source_code
> rm $jar_file -f
> $java_home/bin/jar -cf $jar_file -C build/ .
> $hadoop_bin --config $hadoop_config fs -rmr $output_path
> $hadoop_bin --config $hadoop_config jar $jar_file $main_class $num_of_node $num_of_mapper
>
> rm brandes_mapper
>
> g++ src/mslab/mapred_brandes.cpp -O3 -o brandes_mapper
> $hadoop_bin --config $hadoop_config jar $hadoop_streaming -D mapred.task.timeout=0 -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" -D mapred.reduce.tasks=$num_of_reducer -input input_BC_N$((num_of_node))_M$((num_of_mapper)) -output $output_path -file brandes_mapper -file src/mslab/BC_reducer.py -file src/mslab/MapReduceUtil.py -file $input_path -mapper "./brandes_mapper $input_path $num_of_node" -reducer "./BC_reducer.py"
>
> After running this code, I get the following error:
> 13/04/22 12:29:44 INFO streaming.StreamJob:  map 0%  reduce 0%
> 13/04/22 12:30:01 INFO streaming.StreamJob:  map 20%  reduce 0%
> 13/04/22 12:30:10 INFO streaming.StreamJob:  map 40%  reduce 0%
> 13/04/22 12:30:13 INFO streaming.StreamJob:  map 40%  reduce 2%
> 13/04/22 12:30:16 INFO streaming.StreamJob:  map 40%  reduce 13%
> 13/04/22 12:30:19 INFO streaming.StreamJob:  map 60%  reduce 13%
> 13/04/22 12:30:28 INFO streaming.StreamJob:  map 60%  reduce 17%
> 13/04/22 12:30:31 INFO streaming.StreamJob:  map 60%  reduce 20%
> 13/04/22 12:31:01 INFO streaming.StreamJob:  map 100%  reduce 100%
> 13/04/22 12:31:01 INFO streaming.StreamJob: To kill this job, run:
> 13/04/22 12:31:01 INFO streaming.StreamJob: /usr/local/hadoop/hadoop-1.0.4/libexec/../bin/hadoop job  -Dmapred.job.tracker=localhost:54311 -kill job_201304221215_0002
> 13/04/22 12:31:01 INFO streaming.StreamJob: Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201304221215_0002
> 13/04/22 12:31:01 ERROR streaming.StreamJob: Job not successful. Error: # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201304221215_0002_m_000006
> 13/04/22 12:31:01 INFO streaming.StreamJob: killJob...
>
> Even if I reduce num_of_node to 10 and num_of_mapper to 10, I get the same error. Can someone help me solve this error?
>
> Any help is appreciated
>
> Thanks
>
> Prithvi
>
>
>
> On Mon, Apr 22, 2013 at 12:49 PM, Chris Nauroth <[EMAIL PROTECTED]> wrote:
>
>> (Moving to user list, hdfs-dev bcc'd.)
>>
>> Hi Prithvi,
>>
>> From a quick scan, it looks to me like one of your commands ends up using
>> "input_path" as a string literal instead of replacing with the value of the
>> input_path variable.  I've pasted the command below.  Notice that one of
>> the -file options used "input_path" instead of "$input_path".
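>>
>> As a quick illustration (a minimal sketch, not taken from your script), the
>> shell only substitutes the value when the name is prefixed with "$":
>>
>>     input_path=/data/dblp_author_conf_adj.txt
>>     echo -file input_path     # passes the literal text:  -file input_path
>>     echo -file $input_path    # passes:  -file /data/dblp_author_conf_adj.txt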
>>
>> Is that the problem?
>>
>> Hope this helps,