MapReduce >> mail # user >> question about using java in streaming mode


Siddhartha Jonnalagadda 2011-06-05, 05:16
Re: question about using java in streaming mode
Why are you using Java in streaming mode instead of the native Mapper/Reducer code?
Can you show us the JobTracker's logs?

Regards
----- Original Message -----
From: "Siddhartha Jonnalagadda" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Sunday, June 5, 2011 7:16:08 GMT +01:00 Amsterdam / Berlin / Bern / Rome / Stockholm / Vienna
Subject: question about using java in streaming mode

Hi,
I was able to use streaming in Hadoop with Python for the wordcount program, but I created a Mapper and Reducer in Java since all my code is currently in Java.
I first tried this:
echo "foo foo quux labs foo bar quux" | java -cp ~/dummy.jar WCMapper | sort | java -cp ~/dummy.jar WCReducer

It gave the correct output:
labs 1
foo 3
bar 1
quux 2
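
For readers following along, a streaming-style word count of the kind described above could look roughly like the sketch below. This is not the poster's code: the class name StreamingWordCount, the "reduce" argument, and the tab-separated "word<TAB>count" record format are all assumptions made for illustration. The only contract it relies on is the one the local pipeline test exercises: read lines from stdin, write records to stdout, and let sort group equal keys before the reducer sees them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for WCMapper/WCReducer; the real classes are not shown in the thread.
public class StreamingWordCount {

    // Map step: emit one "word<TAB>1" record per whitespace-separated token.
    static List<String> mapLine(String line) {
        List<String> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(token + "\t1");
            }
        }
        return out;
    }

    // Reduce step: assumes records arrive sorted by key (as after `sort`),
    // so equal keys are adjacent and can be summed in a single pass.
    static List<String> reduceSorted(List<String> sortedRecords) {
        List<String> out = new ArrayList<>();
        String current = null;
        int sum = 0;
        for (String record : sortedRecords) {
            String[] kv = record.split("\t", 2);
            if (kv.length < 2) {
                continue; // skip malformed records
            }
            int count = Integer.parseInt(kv[1].trim());
            if (!kv[0].equals(current)) {
                if (current != null) {
                    out.add(current + "\t" + sum);
                }
                current = kv[0];
                sum = 0;
            }
            sum += count;
        }
        if (current != null) {
            out.add(current + "\t" + sum);
        }
        return out;
    }

    // Runs as a mapper by default, or as a reducer when the first argument is "reduce".
    public static void main(String[] args) throws IOException {
        boolean reduce = args.length > 0 && args[0].equals("reduce");
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        List<String> buffered = new ArrayList<>(); // reducer input, buffered for simplicity
        String line;
        while ((line = in.readLine()) != null) {
            if (reduce) {
                buffered.add(line);
            } else {
                for (String record : mapLine(line)) {
                    System.out.println(record);
                }
            }
        }
        if (reduce) {
            for (String record : reduceSorted(buffered)) {
                System.out.println(record);
            }
        }
    }
}
```

Locally it would be run the same way as the pipeline above, with the mapper invoked as "java StreamingWordCount" and the reducer as "java StreamingWordCount reduce". The reducer buffers its whole input only to keep the sketch short; a reducer meant for large inputs would sum and print as it reads.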

Then I set up a single-node Hadoop cluster and tried this (adapted from the Python command):
hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar -mapper "java -cp ~/dummy.jar WCMapper" -reducer "java -cp ~/dummy.jar WCReducer" -input gutenberg/* -output gutenberg-output -file dummy.jar

This is the error:
hadoop@siddhartha-laptop:/usr/local/hadoop$ hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar -mapper "java -cp ~/dummy.jar WCMapper" -reducer "java -cp ~/dummy.jar WCReducer" -input gutenberg/* -output gutenberg-output -file dummy.jar
packageJobJar: [dummy.jar, /app/hadoop/tmp/hadoop-unjar5573454211442575176/] [] /tmp/streamjob6721719460213928092.jar tmpDir=null
11/06/04 20:47:15 INFO mapred.FileInputFormat: Total input paths to process : 3
11/06/04 20:47:15 INFO streaming.StreamJob: getLocalDirs(): [/app/hadoop/tmp/mapred/local]
11/06/04 20:47:15 INFO streaming.StreamJob: Running job: job_201106031901_0039
11/06/04 20:47:15 INFO streaming.StreamJob: To kill this job, run:
11/06/04 20:47:15 INFO streaming.StreamJob: /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:54311 -kill job_201106031901_0039
11/06/04 20:47:15 INFO streaming.StreamJob: Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201106031901_0039
11/06/04 20:47:16 INFO streaming.StreamJob: map 0% reduce 0%
11/06/04 20:48:00 INFO streaming.StreamJob: map 100% reduce 100%
11/06/04 20:48:00 INFO streaming.StreamJob: To kill this job, run:
11/06/04 20:48:00 INFO streaming.StreamJob: /usr/local/hadoop/bin/../bin/hadoop job -Dmapred.job.tracker=localhost:54311 -kill job_201106031901_0039
11/06/04 20:48:00 INFO streaming.StreamJob: Tracking URL: http://localhost:50030/jobdetails.jsp?jobid=job_201106031901_0039
11/06/04 20:48:00 ERROR streaming.StreamJob: Job not successful. Error: NA
11/06/04 20:48:00 INFO streaming.StreamJob: killJob…
Streaming Job Failed!

Any advice?
Sincerely,
Siddhartha Jonnalagadda,
Text mining Researcher, Lnx Research, LLC, Orange, CA
sjonnalagadda.wordpress.com

--
Marcos Luís Ortíz Valmaseda
 Software Engineer (Large-Scaled Distributed Systems)
http://marcosluis2186.posterous.com
Siddhartha Jonnalagadda 2011-06-05, 20:01
Siddhartha Jonnalagadda 2011-06-05, 20:18
Marcos Ortiz 2011-06-05, 21:52