MapReduce >> mail # user >> MapReduce Tutorial tweak


Re: MapReduce Tutorial tweak
As far as I understand, StringTokenizer.nextToken returns a plain Java
String, which does not implement the Writable interface (keys must be
WritableComparable) that Hadoop MapReduce needs for serialization and
transport. The Text class does implement it, which is why it is used to
wrap the Java String before passing it on to output.collect.
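
To illustrate, here is a minimal sketch that runs without Hadoop on the classpath. SketchWritable and SketchText are hypothetical stand-ins I made up for Writable and Text; they only mimic the general shape (write/readFields plus compareTo) and are not Hadoop's real classes. The collect method is likewise an assumption, meant to show why a bare String would be rejected by a collector whose key type must be both serializable and comparable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.StringTokenizer;

public class TextSketch {

    // Hypothetical stand-in for a Writable-style contract (not Hadoop's interface).
    public interface SketchWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Hypothetical stand-in for Text: a mutable String wrapper that can be
    // serialized and compared.
    public static final class SketchText
            implements SketchWritable, Comparable<SketchText> {
        private String value = "";

        public void set(String s) { value = s; }

        @Override
        public void write(DataOutput out) throws IOException { out.writeUTF(value); }

        @Override
        public void readFields(DataInput in) throws IOException { value = in.readUTF(); }

        @Override
        public int compareTo(SketchText other) { return value.compareTo(other.value); }

        @Override
        public String toString() { return value; }
    }

    // Accepts only keys that are both serializable and comparable.
    // collect("some string") would not compile: String is Comparable
    // but does not implement SketchWritable.
    public static <K extends SketchWritable & Comparable<K>> byte[] collect(K key) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            key.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Wraps a token, serializes it, and reads it back into a fresh wrapper.
    public static String roundTrip(String token) {
        try {
            SketchText word = new SketchText();
            word.set(token);
            byte[] bytes = collect(word);
            SketchText copy = new SketchText();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return copy.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SketchText word = new SketchText(); // one instance, reused for every token
        StringTokenizer tokenizer = new StringTokenizer("hello hadoop hello");
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            System.out.println(word + " -> " + collect(word).length + " bytes");
        }
    }
}
```

The main method mirrors the pattern in the tutorial: a single wrapper object is created once and refilled with set() for each token, rather than allocating a new object per token.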

Regards,
Shahab
On Tue, Aug 27, 2013 at 11:16 AM, Andrew Pennebaker
<[EMAIL PROTECTED]> wrote:

> In https://hadoop.apache.org/docs/stable/mapred_tutorial.html#Source+Code,
> line 16 declares:
>
> private Text word = new Text();
>
> ...
>
> But only lines 22 and 23 use this, and only to pass the value along to
> output:
>
> word.set(tokenizer.nextToken());
> output.collect(word, one);
>
> Wouldn't this be better expressed as:
>
> (no private Text word)
>
> ...
>
> output.collect(tokenizer.nextToken(), one);
>
> ?
>