HDFS >> mail # user >> Error while using the Hadoop Streaming


Re: Error while using the Hadoop Streaming
Hi,

In your first mail you put the "/usr/bin/python" binary right after
"-mapper". I don't think we need the python executable to run this example.

Make sure that you are using the correct paths to your files "mapper.py" and
"reducer.py" when executing the job.
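
A quick local check along these lines might help before resubmitting. This is only a sketch under assumptions: the helper name check_streaming_script is made up for illustration, and it only tests the things I would expect to matter when a script is passed to "-mapper" or "-reducer" directly (the file exists, it is executable, and it starts with a "#!" line such as #!/usr/bin/python). It looks at the local files only, not at what the cluster nodes see.

```python
import os
import sys


def check_streaming_script(path):
    """Return a list of problems found for one local script path.

    Hypothetical helper, not part of Hadoop: an empty list means the file
    exists, is executable, and begins with a shebang line.
    """
    if not os.path.isfile(path):
        return ["file not found: " + path]
    problems = []
    # A script run directly (without naming an interpreter) needs the execute bit.
    if not os.access(path, os.X_OK):
        problems.append("not executable (try chmod +x): " + path)
    # It also needs a shebang line so the OS knows which interpreter to start.
    with open(path, "rb") as handle:
        first_line = handle.readline()
    if not first_line.startswith(b"#!"):
        problems.append("missing shebang line such as #!/usr/bin/python: " + path)
    return problems


if __name__ == "__main__":
    # Usage: python check_scripts.py /path/to/mapper.py /path/to/reducer.py
    for script in sys.argv[1:]:
        for line in check_streaming_script(script) or ["ok: " + script]:
            print(line)
```

If both scripts come back "ok", the paths given to "-file", "-mapper" and "-reducer" are the next thing I would compare character by character.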
~Thanks

On Fri, May 24, 2013 at 11:31 PM, Adamantios Corais <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> Thanks a lot for your response.
>
> Unfortunately, I ran into the same problem.
>
> What do you mean by "python binary"? This is what I have in the very first
> line of both scripts: #!/usr/bin/python
>
> Any ideas?
>
>
> On Fri, May 24, 2013 at 7:41 PM, Jitendra Yadav <
> [EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> I have run Michael's python map reduce example several times without any
>> issue.
>>
>> I think this issue is related to your file path 'mapper.py'. Are you
>> using the python binary?
>>
>> Try this:
>>
>> hadoop jar /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/hadoop-streaming-1.1.2.jar \
>>   -input /user/yyy/20417-8.txt \
>>   -output /user/yyy/output \
>>   -file /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/mapper.py \
>>   -mapper /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/mapper.py \
>>   -file /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/reducer.py \
>>   -reducer /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/reducer.py
>>
>>
>> Thanks~
>>
>> On Fri, May 24, 2013 at 10:12 PM, Adamantios Corais <
>> [EMAIL PROTECTED]> wrote:
>>
>>> I tried this nice example:
>>> http://www.michael-noll.com/tutorials/writing-an-hadoop-mapreduce-program-in-python/
>>>
>>> The python scripts work fine on my laptop (from the terminal), but
>>> they fail when I run them on CDH3 (pseudo-distributed mode).
>>>
>>> Any ideas?
>>>
>>> hadoop jar /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/hadoop-streaming-1.1.2.jar \
>>>   -input /user/yyy/20417-8.txt \
>>>   -output /user/yyy/output \
>>>   -file /usr/bin/python /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/mapper.py \
>>>   -mapper /usr/bin/python /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/mapper.py \
>>>   -file /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/reducer.py \
>>>   -reducer /home/yyy/Dropbox/Private/xxx/Projects/task_week_22/reducer.py
>>>
>>> -----------------------
>>>
>>> 2013-05-24 18:21:12,232 INFO org.apache.hadoop.util.NativeCodeLoader:
>>> Loaded the native-hadoop library
>>> 2013-05-24 18:21:12,569 INFO
>>> org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating
>>> symlink:
>>> /var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/yyy/jobcache/job_201305160832_0020/jars/job.jar
>>> <-
>>> /var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/yyy/jobcache/job_201305160832_0020/attempt_201305160832_0020_m_000000_0/work/job.jar
>>> 2013-05-24 18:21:12,586 INFO
>>> org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating
>>> symlink:
>>> /var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/yyy/jobcache/job_201305160832_0020/jars/.job.jar.crc
>>> <-
>>> /var/lib/hadoop-0.20/cache/mapred/mapred/local/taskTracker/yyy/jobcache/job_201305160832_0020/attempt_201305160832_0020_m_000000_0/work/.job.jar.crc
>>> 2013-05-24 18:21:12,717 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>>> Initializing JVM Metrics with processName=MAP, sessionId=
>>> 2013-05-24 18:21:13,062 INFO org.apache.hadoop.util.ProcessTree: setsid
>>> exited with exit code 0
>>> 2013-05-24 18:21:13,087 INFO org.apache.hadoop.mapred.Task:  Using
>>> ResourceCalculatorPlugin :
>>> org.apache.hadoop.util.LinuxResourceCalculatorPlugin@1358f03
>>> 2013-05-24 18:21:13,452 WARN
>>> org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library is
>>> available
>>> 2013-05-24 18:21:13,452 INFO
>>> org.apache.hadoop.io.compress.snappy.LoadSnappy: Snappy native library
>>> loaded
>>> 2013-05-24 18:21:13,464 INFO org.apache.hadoop.mapred.MapTask:
>>> numReduceTasks: 1
>>> 2013-05-24 18:21:13,477 INFO org.apache.hadoop.mapred.MapTask: