HDFS >> mail # user >> Error while using the Hadoop Streaming


Adamantios Corais 2013-05-24, 16:42
Re: Error while using the Hadoop Streaming
Hi,

I have run Michael's python map reduce example several times without any
issue.

I think this issue is related to the file path you give for 'mapper.py'. Are
you using the python binary?

Try this:

hadoop jar /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/hadoop-streaming-1.1.2.jar \
  -input /user/yyy/20417-8.txt \
  -output /user/yyy/output \
  -file /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/mapper.py \
  -mapper /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/mapper.py \
  -file /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/reducer.py \
  -reducer /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/reducer.py
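
Since the stack trace shows streaming starting the mapper command directly
through ProcessBuilder, it may also be worth checking that each script can be
exec'd on its own before resubmitting. Below is a minimal sketch of such a
check. The function name check_streaming_script is my own and is not part of
Hadoop, and the three conditions it tests (file exists, execute bit set,
shebang on the first line) are my assumption about the usual causes of
"error=2, No such file or directory", not a confirmed diagnosis of this job.

```python
import os
import sys


def check_streaming_script(path):
    """Return a list of problems that could stop `path` from being exec'd directly.

    Hypothetical helper for illustration; it only inspects the local file.
    """
    if not os.path.isfile(path):
        return ["%s: no such file" % path]

    problems = []

    # A script started without an interpreter in front needs the execute bit.
    if not os.access(path, os.X_OK):
        problems.append("%s: not executable (try chmod +x)" % path)

    # The kernel reads the first line to find the interpreter.
    with open(path, "rb") as f:
        first_line = f.readline()
    if not first_line.startswith(b"#!"):
        problems.append("%s: no shebang line such as #!/usr/bin/env python" % path)
    elif first_line.endswith(b"\r\n"):
        # With CRLF the interpreter name ends in a carriage return,
        # so the lookup fails with "No such file or directory".
        problems.append("%s: shebang line ends with CRLF" % path)

    return problems


if __name__ == "__main__":
    # Print one line per problem found in the paths given on the command line.
    for script in sys.argv[1:]:
        for problem in check_streaming_script(script):
            print(problem)
```

Saved under a name of your choice, it can be run with the full paths to
mapper.py and reducer.py as arguments. No output means all three checks
passed.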
Thanks~

On Fri, May 24, 2013 at 10:12 PM, Adamantios Corais <
[EMAIL PROTECTED]> wrote:

> I tried this nice example:
> http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
>
> The python scripts work fine on my laptop (from the terminal), but they
> fail when I execute them on CDH3 (pseudo-distributed mode).
>
> Any ideas?
>
> hadoop jar /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/hadoop-streaming-1.1.2.jar \
> -input /user/yyy/20417-8.txt \
> -output /user/yyy/output \
> -file /usr/bin/python /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/mapper.py \
> -mapper /usr/bin/python /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/mapper.py \
> -file /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/reducer.py \
> -reducer /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/reducer.py
>
> -----------------------
>
> 2013-05-24 18:21:12,232 INFO org.apache.hadoop.util.NativeCodeLoader:
> Loaded the native-hadoop library
> 2013-05-24 18:21:12,569 INFO
> org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating
> symlink:
> /var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/yyy/jobcache/job_201305160832_0020/jars/job.jar
> <-
> /var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/yyy/jobcache/job_201305160832_0020/attempt_201305160832_0020_m_000000_0/work/job.jar
> 2013-05-24 18:21:12,586 INFO
> org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating
> symlink:
> /var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/yyy/jobcache/job_201305160832_0020/jars/.job.jar.crc
> <-
> /var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/yyy/jobcache/job_201305160832_0020/attempt_201305160832_0020_m_000000_0/work/.job.jar.crc
> 2013-05-24 18:21:12,717 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
> Initializing JVM Metrics with processName=MAP, sessionId=
> 2013-05-24 18:21:13,062 INFO org.apache.hadoop.util.ProcessTree: setsid
> exited with exit code 0
> 2013-05-24 18:21:13,087 INFO org.apache.hadoop.mapred.Task:  Using
> ResourceCalculatorPlugin :
> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1358f03
> 2013-05-24 18:21:13,452 WARN
> org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library is
> available
> 2013-05-24 18:21:13,452 INFO
> org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
> loaded
> 2013-05-24 18:21:13,464 INFO org.apache.hadoop.mapred.MapTask:
> numReduceTasks: 1
> 2013-05-24 18:21:13,477 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb
> = 100
> 2013-05-24 18:21:13,635 INFO org.apache.hadoop.mapred.MapTask: data buffer
> = 79691776/99614720
> 2013-05-24 18:21:13,635 INFO org.apache.hadoop.mapred.MapTask: record
> buffer = 262144/327680
> 2013-05-24 18:21:13,724 INFO org.mortbay.log: Logging to
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via
> org.mortbay.log.Slf4jLog
> 2013-05-24 18:21:13,733 INFO org.apache.hadoop.streaming.PipeMapRed:
> PipeMapRed exec [mapper.py]
> 2013-05-24 18:21:13,783 ERROR org.apache.hadoop.streaming.PipeMapRed:
> configuration exception
> java.io.IOException: Cannot run program "mapper.py": java.io.IOException:
> error=2, No such file or directory
>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
>     at
> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
>     at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)