MapReduce >> mail # user >> hadoop streaming troubles


hadoop streaming troubles
Hello,
I am using hadoop-0.20.2-cdh3u1.
 First question:
    Can one omit sorting in streaming (e.g. when one only sums numbers)?
 Second question:
    Why do I have to run my jobs from an empty current working directory? When
I run the job from my home directory, I get this:
    13/12/19 16:22:40 ERROR streaming.StreamJob: Error launching job , bad input path : File file:/home/xhancar/.mozilla/.ozilla/firefox/vab3tgqp.default/lock does not exist.
    The path seems like total nonsense.
      Thanks,
      Pavel Hančar

P.S.
The whole thing:

xhancar@alba:~$ /packages/run.64/hadoop-0.20.2-cdh3u1/bin/hadoop jar \
    /packages/run.64/hadoop-0.20.2-cdh3u1/contrib/streaming/hadoop-streaming-0.20.2-cdh3u1.jar \
    -D stream.non.zero.exit.is.failure=false \
    -D stream.map.input.ignoreKey=true \
    -D mapred.reduce.tasks=1 \
    -libjars '' \
    -input /user/xhancar/cxvii \
    -output output \
    -mapper /home/xhancar/dp/bin/wcl.sh -file /home/xhancar/dp/bin/wcl.sh \
    -reducer /home/xhancar/dp/bin/sum.sh -file /home/xhancar/dp/bin/sum.sh \
    -inputformat org.apache.hadoop.mapred.TextInputFormat
packageJobJar: [/home/xhancar/dp/bin/wcl.sh, /home/xhancar/dp/bin/sum.sh, /tmp/hadoop-xhancar/hadoop-unjar1228378690975633202/] [] /tmp/streamjob8038106018846805847.jar tmpDir=null
13/12/19 16:22:40 INFO mapred.JobClient: Cleaning up the staging area hdfs://alba:9000/tmp/hadoop-hadoopnlp/mapred/staging/xhancar/.staging/job_201312191531_0012
13/12/19 16:22:40 ERROR streaming.StreamJob: Error launching job , bad input path : File file:/home/xhancar/.mozilla/.ozilla/firefox/vab3tgqp.default/lock does not exist.
Streaming Command Failed!

But the same command succeeds from an empty directory:

xhancar@alba:~$ cd empty/
xhancar@alba:~/empty$ /packages/run.64/hadoop-0.20.2-cdh3u1/bin/hadoop jar \
    /packages/run.64/hadoop-0.20.2-cdh3u1/contrib/streaming/hadoop-streaming-0.20.2-cdh3u1.jar \
    -D stream.non.zero.exit.is.failure=false \
    -D stream.map.input.ignoreKey=true \
    -D mapred.reduce.tasks=1 \
    -libjars '' \
    -input /user/xhancar/cxvii \
    -output output \
    -mapper /home/xhancar/dp/bin/wcl.sh -file /home/xhancar/dp/bin/wcl.sh \
    -reducer /home/xhancar/dp/bin/sum.sh -file /home/xhancar/dp/bin/sum.sh \
    -inputformat org.apache.hadoop.mapred.TextInputFormat
packageJobJar: [/home/xhancar/dp/bin/wcl.sh, /home/xhancar/dp/bin/sum.sh, /tmp/hadoop-xhancar/hadoop-unjar928216275517356553/] [] /tmp/streamjob361197118805255140.jar tmpDir=null
13/12/19 16:22:53 WARN snappy.LoadSnappy: Snappy native library is available
13/12/19 16:22:53 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/12/19 16:22:53 INFO snappy.LoadSnappy: Snappy native library loaded
13/12/19 16:22:53 INFO mapred.FileInputFormat: Total input paths to process : 1
13/12/19 16:22:54 INFO streaming.StreamJob: getLocalDirs(): [/tmp/hadoop-xhancar/mapred/local]
13/12/19 16:22:54 INFO streaming.StreamJob: Running job: job_201312191531_0013
13/12/19 16:22:54 INFO streaming.StreamJob: To kill this job, run:
13/12/19 16:22:54 INFO streaming.StreamJob: /packages/run.64/hadoop-0.20.2-cdh3u1/bin/../bin/hadoop job -Dmapred.job.tracker=alba:9001 -kill job_201312191531_0013
13/12/19 16:22:54 INFO streaming.StreamJob: Tracking URL: http://alba.fi.muni.cz:50030/jobdetails.jsp?jobid=job_201312191531_0013
13/12/19 16:22:55 INFO streaming.StreamJob:  map 0%  reduce 0%
13/12/19 16:23:01 INFO streaming.StreamJob:  map 100%  reduce 0%
13/12/19 16:23:09 INFO streaming.StreamJob:  map 100%  reduce 33%
13/12/19 16:23:11 INFO streaming.StreamJob:  map 100%  reduce 100%
13/12/19 16:23:14 INFO streaming.StreamJob: Job complete: job_201312191531_0013
13/12/19 16:23:14 INFO streaming.StreamJob: Output: output
xhancar@alba:~/empty$
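
To illustrate the first question: a reducer that only sums numbers produces the same total whatever order its input arrives in, which is why the sort looks unnecessary here. Below is a minimal sketch, assuming one integer in the first field of each input line. The function name sum_lines is hypothetical, and this is not the actual sum.sh, whose contents are not shown above.

```shell
#!/bin/sh
# Hypothetical stand-in for a summing reducer (NOT the real sum.sh).
# Assumption: each stdin line carries one integer in its first field.
sum_lines() {
    awk '{ total += $1 } END { print total + 0 }'
}

# The total does not depend on the order of the input lines.
printf '3\n1\n2\n' | sum_lines
printf '1\n2\n3\n' | sum_lines
```

Both invocations print the same total, so for a pure sum the ordering guarantee of the shuffle phase is not needed for correctness.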