Hadoop >> mail # user >> Hadoop-streaming with a c binary executable as a mapper


Re: Hadoop-streaming with a c binary executable as a mapper
Your executable needs to read lines from standard input. Try setting your mapper like this:

> -mapper "/data/yehdego/hadoop-0.20.2/pknotsRG -"

If that doesn't work, you may need to execute your C program from a shell script. The trailing - added to the command line tells the program to read from STDIN.
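
If you go the shell script route, something like the following might be a starting point. It is only a sketch under assumptions: the script name mapper.sh, the MAPPER_BIN variable and the "run" argument are made up for illustration, and it assumes pknotsRG really does accept - to mean "read from STDIN" and that the binary lands in the task's working directory when shipped with -file.

```shell
#!/bin/sh
# mapper.sh: a hypothetical wrapper script (the name is illustrative).
# Assumptions: pknotsRG is shipped with -file, so it sits in the task's
# working directory, and "-" makes it read sequences from standard input.
MAPPER_BIN="${MAPPER_BIN:-./pknotsRG}"

run_mapper() {
    # Hadoop Streaming writes one input record per line to our stdin and
    # reads results from our stdout; the child process inherits both.
    "$MAPPER_BIN" "$@"
}

# Run only when invoked with an explicit "run" argument, so that the
# function can also be sourced and tried against a stand-in command.
if [ "${1:-}" = "run" ]; then
    run_mapper -
fi
```

With that in place the job options would presumably become -mapper "sh mapper.sh run" plus a -file entry for both mapper.sh and pknotsRG. Before submitting, you can approximate what streaming does by piping the input file into the script locally (cat yourinput.txt | sh mapper.sh run); if that fails or prints nothing, the job will most likely fail the same way.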

-Joey
On Jul 22, 2011, at 10:41, Daniel Yehdego <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I am using hadoop-streaming to parallelize the processing of a large RNA data set. I am using a
> C binary executable called pknotsRG as my mapper. My command to
> run the job looks like this:
>
> HADOOP_HOME$  bin/hadoop
> jar /data/yehdego/hadoop-0.20.2/hadoop-0.20.2-streaming.jar
> -mapper /data/yehdego/hadoop-0.20.2/pknotsRG
> -file /data/yehdego/hadoop-0.20.2/pknotsRG
> -input /user/yehdego/RF00028_B.bpseqL3G5_seg_Centered_Method.txt
> -output /user/yehdego/RF-out
> -reducer NONE
> -verbose
>
> and I keep getting the following error messages:
>
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> failed with code 1
>    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:311)
>    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:545)
>    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:132)
>    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
>    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>    at org.apache.hadoop.mapred.Child.main(Child.java:170)
>
> FYI: I am inputting a file with lines of sequences, and the mapper is expected to take each line
> and predict its 2D secondary structure. I tried the executable locally and it worked.
>
> [yehdego@bulgaria hadoop-0.20.2]$ ./pknotsRG
> RF00028_B.bpseqL3G5_seg_Centered_Method.txt
>
> AUGACUCUCUAAAUUGCAAAAUUUACCUUUGGAGGGAAAAGUUAUCAGGCCUGCACCUGAUAGCUAGUCUUUAAACCAAUAGAUUGCAUCGGUUUAAUA
> ....(((((((((..............)))))))))...((((((((((......))))))))))[[[[[.{{{{{{...]]]]].....}}}}}}...  
> GCAAGACCGUCAAAUUGCGGGAAAAGGGU
> ......((((......)))).........  
> CAACAGCCGUUCAGUACCAAGUCUCAGGGGA
> ......((.((.((........)).)).)).  
> AACUUUGAGAUGGCCUUGCAAAGGAUAUGGUAAUAAGCUGACGGACAGGGUCCUAACCACGCAGCCAAGUCCUAAGUCAACAUUU
> ......[[[.{{{{]]]....(((((.((((.....((((..((((...))))....)).)).)))).)))))..}}}}......  
> CGGUGUUGAUAUGGAUGCAGUUCACAGACUAAAUGUCGGUCGGGGAAGAAUAGGUAUUCUUCUCAUAAGAUAUAGUCGGACCUCUCCUUAAUGGGAGCU
> .(((.......(((((...)))))..(((((..((((.....(((((((((....)))))))))....)))))))))..))).(((((....)))))..  