hadoop streaming troubles
Hello,
I am using hadoop-0.20.2-cdh3u1.
 First question:
    Can one omit sorting in streaming (e.g. when one only sums numbers)?
 Second question:
    Why do I have to run my jobs from an empty current working directory?
    When I run them from my home directory, I get this:
    13/12/19 16:22:40 ERROR streaming.StreamJob: Error launching job , bad input path : File file:/home/xhancar/.mozilla/.ozilla/firefox/vab3tgqp.default/lock does not exist.
    The path looks like total nonsense.
      Thanks,
      Pavel Hančar

P.S.
The whole thing:

xhancar@alba:~$ /packages/run.64/hadoop-0.20.2-cdh3u1/bin/hadoop jar \
    /packages/run.64/hadoop-0.20.2-cdh3u1/contrib/streaming/hadoop-streaming-0.20.2-cdh3u1.jar \
    -D stream.non.zero.exit.is.failure=false \
    -D stream.map.input.ignoreKey=true \
    -D mapred.reduce.tasks=1 \
    -libjars '' \
    -input /user/xhancar/cxvii \
    -output output \
    -mapper /home/xhancar/dp/bin/wcl.sh \
    -file /home/xhancar/dp/bin/wcl.sh \
    -reducer /home/xhancar/dp/bin/sum.sh \
    -file /home/xhancar/dp/bin/sum.sh \
    -inputformat org.apache.hadoop.mapred.TextInputFormat
packageJobJar: [/home/xhancar/dp/bin/wcl.sh, /home/xhancar/dp/bin/sum.sh, /tmp/hadoop-xhancar/hadoop-unjar1228378690975633202/] [] /tmp/streamjob8038106018846805847.jar tmpDir=null
13/12/19 16:22:40 INFO mapred.JobClient: Cleaning up the staging area hdfs://alba:9000/tmp/hadoop-hadoopnlp/mapred/staging/xhancar/.staging/job_201312191531_0012
13/12/19 16:22:40 ERROR streaming.StreamJob: Error launching job , bad input path : File file:/home/xhancar/.mozilla/.ozilla/firefox/vab3tgqp.default/lock does not exist.
Streaming Command Failed!

But:

xhancar@alba:~$ cd empty/
xhancar@alba:~/empty$ /packages/run.64/hadoop-0.20.2-cdh3u1/bin/hadoop jar \
    /packages/run.64/hadoop-0.20.2-cdh3u1/contrib/streaming/hadoop-streaming-0.20.2-cdh3u1.jar \
    -D stream.non.zero.exit.is.failure=false \
    -D stream.map.input.ignoreKey=true \
    -D mapred.reduce.tasks=1 \
    -libjars '' \
    -input /user/xhancar/cxvii \
    -output output \
    -mapper /home/xhancar/dp/bin/wcl.sh \
    -file /home/xhancar/dp/bin/wcl.sh \
    -reducer /home/xhancar/dp/bin/sum.sh \
    -file /home/xhancar/dp/bin/sum.sh \
    -inputformat org.apache.hadoop.mapred.TextInputFormat
packageJobJar: [/home/xhancar/dp/bin/wcl.sh, /home/xhancar/dp/bin/sum.sh, /tmp/hadoop-xhancar/hadoop-unjar928216275517356553/] [] /tmp/streamjob361197118805255140.jar tmpDir=null
13/12/19 16:22:53 WARN snappy.LoadSnappy: Snappy native library is available
13/12/19 16:22:53 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/12/19 16:22:53 INFO snappy.LoadSnappy: Snappy native library loaded
13/12/19 16:22:53 INFO mapred.FileInputFormat: Total input paths to process : 1
13/12/19 16:22:54 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-xhancar/mapred/local]
13/12/19 16:22:54 INFO streaming.StreamJob: Running job: job_201312191531_0013
13/12/19 16:22:54 INFO streaming.StreamJob: To kill this job, run:
13/12/19 16:22:54 INFO streaming.StreamJob: /packages/run.64/hadoop-0.20.2-cdh3u1/bin/../bin/hadoop job -Dmapred.job.tracker=alba:9001 -kill job_201312191531_0013
13/12/19 16:22:54 INFO streaming.StreamJob: Tracking URL: http://alba.fi.muni.cz:50030/jobdetails.jsp?jobid=job_201312191531_0013
13/12/19 16:22:55 INFO streaming.StreamJob:  map 0%  reduce 0%
13/12/19 16:23:01 INFO streaming.StreamJob:  map 100%  reduce 0%
13/12/19 16:23:09 INFO streaming.StreamJob:  map 100%  reduce 33%
13/12/19 16:23:11 INFO streaming.StreamJob:  map 100%  reduce 100%
13/12/19 16:23:14 INFO streaming.StreamJob: Job complete: job_201312191531_0013
13/12/19 16:23:14 INFO streaming.StreamJob: Output: output
xhancar@alba:~/empty$
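
To make the first question concrete, here is a minimal sketch of the kind of summing reducer meant, assuming one integer per input line. The function name sum_lines is hypothetical and the real sum.sh is not shown in this message; the sketch only illustrates that the total does not depend on the order in which the lines arrive.

```shell
#!/bin/sh
# Hypothetical stand-in for a summing reducer such as sum.sh
# (the original script is not part of this message).
# Assumption: stdin carries one integer per line.
sum_lines() {
    # "total + 0" forces a numeric 0 to be printed for empty input.
    awk '{ total += $1 } END { print total + 0 }'
}

# The same numbers in unsorted and sorted order give the same total.
printf '3\n1\n2\n' | sum_lines
printf '1\n2\n3\n' | sum_lines
```

Both invocations print 6, which is why sorting looks unnecessary for this particular reducer.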