MapReduce >> mail # dev >> hadoop streaming using java as mapper & reducer


Re: hadoop streaming using java as mapper & reducer
I'm confused.  First, most people don't use streaming in conjunction with Java, since Hadoop supports Java directly...although I think you were saying in the parenthetical comment at the top of your post that you may have a legitimate reason for doing so.  Then, in your code, you appear to be writing a "classic" MapReduce program, not a streaming one.  I say this because your mapper is a typical one, extending Mapper from the Hadoop API.  I would expect a streaming mapper to be a plain top-level class, not a class derived from the API.  Furthermore, it would gather its input not through an overridden map() method but rather from stdin, and likewise it would send its output to stdout.
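For contrast, here is a rough sketch of what a streaming-style equivalent of your word-count mapper could look like.  The class name and structure are my own invention, not anything from your code; the point is only the contract: a standalone program that reads lines from stdin and writes tab-separated key/value pairs to stdout.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// A plain top-level class -- note: no Hadoop imports, no Mapper base class.
public class StreamingWordCountMapper {

    // Turn one input line into "word<TAB>1" output records.
    static List<String> mapLine(String line) {
        List<String> out = new ArrayList<>();
        StringTokenizer tok = new StringTokenizer(line);
        while (tok.hasMoreTokens()) {
            out.add(tok.nextToken() + "\t1");
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        // The streaming contract: records arrive on stdin, and each
        // emitted key/value pair goes to stdout, separated by a tab.
        BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
        String line;
        while ((line = in.readLine()) != null) {
            for (String record : mapLine(line)) {
                System.out.println(record);
            }
        }
    }
}
```

You would then hand something like this to the streaming jar via its -mapper option (with -input/-output as usual), rather than setting a mapper class on a Job.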

Am I completely misunderstanding your situation?

On May 27, 2012, at 23:50, HU Wenjing A wrote:

> Hi all,
>
> I am a new learner of Hadoop, and recently I have wanted to use Hadoop streaming to run Java programs as the mapper and reducer
> (because I want to use Hadoop streaming to port some existing Java programs that process xml files).
>  To have a try, first I use the hadoop wordcount example (as follows):
>
>     Countm.java:
>     import java.io.IOException;
>     import java.util.StringTokenizer;
>     import org.apache.hadoop.io.IntWritable;
>     import org.apache.hadoop.io.LongWritable;
>     import org.apache.hadoop.io.Text;
>     import org.apache.hadoop.mapreduce.Mapper;
>
>     public class Countm extends Mapper<LongWritable, Text, Text, IntWritable> {
>         private final static IntWritable one = new IntWritable(1);
>         private Text word = new Text();
>
>         public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
>             String line = value.toString();
>             StringTokenizer tokenizer = new StringTokenizer(line);
>             while (tokenizer.hasMoreTokens()) {
>                 word.set(tokenizer.nextToken());
>                 context.write(word, one);
>             }
>         }
>     }
________________________________________________________________________________
Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com

"And what if we picked the wrong religion?  Every week, we're just making God
madder and madder!"
                                           --  Homer Simpson
________________________________________________________________________________