Re: Hadoop Streaming job error - Need help urgent
(Moving to user list, hdfs-dev bcc'd.)

Hi Prithvi,

From a quick scan, it looks like one of your commands ends up using
"input_path" as a string literal instead of substituting the value of the
input_path variable.  I've pasted the command below; notice that one of
the -file options uses "input_path" instead of "$input_path".

Is that the problem?

Hope this helps,
--Chris

    $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
        -D mapred.task.timeout=0 \
        -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
        -D mapred.reduce.tasks=$num_of_reducer \
        -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
        -output $output_path \
        -file brandes_mapper \
        -file src/mslab/BC_reducer.py \
        -file src/mslab/MapReduceUtil.py \
        -file input_path \
        -mapper "./brandes_mapper $input_path $num_of_node" \
        -reducer "./BC_reducer.py"
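If so, the fix would be to reference the variable there as well, so the
actual file gets shipped with the job. A corrected invocation would look
like this (a sketch; only the -file option changes):

    $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
        -D mapred.task.timeout=0 \
        -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
        -D mapred.reduce.tasks=$num_of_reducer \
        -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
        -output $output_path \
        -file brandes_mapper \
        -file src/mslab/BC_reducer.py \
        -file src/mslab/MapReduceUtil.py \
        -file $input_path \
        -mapper "./brandes_mapper $input_path $num_of_node" \
        -reducer "./BC_reducer.py"

One caveat on the sketch above: -file ships a local file into each task's
working directory under its base name, so inside the mapper it will appear
as dblp_author_conf_adj.txt rather than /data/dblp_author_conf_adj.txt.
You may also need to pass "$(basename $input_path)" to brandes_mapper
instead of the full path.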

On Mon, Apr 22, 2013 at 10:11 AM, prithvi dammalapati <
[EMAIL PROTECTED]> wrote:

> I have the following Hadoop code to find the betweenness centrality of a graph:
>
>     java_home=/usr/lib/jvm/java-1.7.0-openjdk-amd64
>     hadoop_home=/usr/local/hadoop/hadoop-1.0.4
>     hadoop_lib=$hadoop_home/hadoop-core-1.0.4.jar
>     hadoop_bin=$hadoop_home/bin/hadoop
>     hadoop_config=$hadoop_home/conf
>     hadoop_streaming=$hadoop_home/contrib/streaming/hadoop-streaming-1.0.4.jar
>
>     # task-specific parameters
>     source_code=BetweennessCentrality.java
>     jar_file=BetweennessCentrality.jar
>     main_class=mslab.BetweennessCentrality
>     num_of_node=38012
>     num_of_mapper=100
>     num_of_reducer=8
>     input_path=/data/dblp_author_conf_adj.txt
>     output_path=dblp_bc_N$((num_of_node))_M$((num_of_mapper))
>
>     rm -rf build
>     mkdir build
>     $java_home/bin/javac -d build -classpath .:$hadoop_lib src/mslab/$source_code
>     rm -f $jar_file
>     $java_home/bin/jar -cf $jar_file -C build/ .
>     $hadoop_bin --config $hadoop_config fs -rmr $output_path
>     $hadoop_bin --config $hadoop_config jar $jar_file $main_class $num_of_node $num_of_mapper
>
>     rm brandes_mapper
>     g++ src/mslab/mapred_brandes.cpp -O3 -o brandes_mapper
>     $hadoop_bin --config $hadoop_config jar $hadoop_streaming \
>         -D mapred.task.timeout=0 \
>         -D mapred.job.name="BC_N$((num_of_node))_M$((num_of_mapper))" \
>         -D mapred.reduce.tasks=$num_of_reducer \
>         -input input_BC_N$((num_of_node))_M$((num_of_mapper)) \
>         -output $output_path \
>         -file brandes_mapper \
>         -file src/mslab/BC_reducer.py \
>         -file src/mslab/MapReduceUtil.py \
>         -file input_path \
>         -mapper "./brandes_mapper $input_path $num_of_node" \
>         -reducer "./BC_reducer.py"
>
> When I run this code in a shell script, I get the following errors:
>
>     Warning: $HADOOP_HOME is deprecated.
>     File: /home/hduser/Downloads/mgmf/trunk/input_path does not exist, or is not readable.
>     Streaming Command Failed!
>
> but the file exists at the specified path:
>
>     /Downloads/mgmf/trunk/data$ ls
>     dblp_author_conf_adj.txt
>
> I have also copied the input file into HDFS using:
>
>     /usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /source /destination
>
> Can someone help me solve this problem?
>
>
> Any help is appreciated,
> Thanks
> Prithvi
>
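A quick way to sanity-check both paths before resubmitting (a sketch that
reuses the variables from the script above):

    # local file that the -file option should ship with the job
    ls -l "$input_path"
    # HDFS directory that the -input option reads
    $hadoop_bin --config $hadoop_config fs -ls input_BC_N$((num_of_node))_M$((num_of_mapper))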