Hadoop >> mail # user >> Basic Question


Mohit Anchlia 2012-08-07, 18:26
Harsh J 2012-08-07, 18:33
Re: Basic Question
On Tue, Aug 7, 2012 at 11:33 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> Each write call registers (writes) a KV pair to the output. The output
> collector neither looks for similarities nor tries to de-dupe the
> pairs, and even if the object is the same, its value is copied on each
> write, so reusing the object doesn't matter.
>
> So you will get two KV pairs in your output, since duplication is
> allowed and is normal in several MR cases. Think of wordcount, where a
> map() call may emit lots of ("is", 1) pairs if there are multiple "is"
> tokens in the line it processes, and can use set() calls to its
> benefit to avoid creating too many objects.
Thanks!

>
> On Tue, Aug 7, 2012 at 11:56 PM, Mohit Anchlia <[EMAIL PROTECTED]>
> wrote:
> > In Mapper I often use a global Text object, and throughout the map
> > processing I just call "set" on it. My question is, what happens if
> > the collector receives a similar byte array value? Does the last one
> > overwrite the value in the collector? So if I did
> >
> > Text zip = new Text();
> > zip.set("9099");
> > collector.write(zip,value);
> > zip.set("9099");
> > collector.write(zip,value1);
> >
> > Should I expect to receive both values in reducer or just one?
>
>
>
> --
> Harsh J
>
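
For readers who want to see the behaviour Harsh describes without a cluster, here is a minimal plain-Java sketch. It does not use Hadoop: MutableText and CopyingCollector are hypothetical stand-ins for Text and the output collector, written on the assumption (taken from the reply above) that the collector copies the key's bytes on each write and never de-dupes.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class ReusedKeyDemo {

    // Hypothetical stand-in for Hadoop's mutable Text: one object whose
    // contents are replaced by each set() call.
    static final class MutableText {
        private byte[] bytes = new byte[0];

        void set(String s) {
            bytes = s.getBytes(StandardCharsets.UTF_8);
        }

        byte[] copyBytes() {
            return bytes.clone();
        }
    }

    // Hypothetical collector: snapshots the key's bytes at write time and
    // appends a record. It never compares or de-duplicates keys.
    static final class CopyingCollector {
        private final List<String[]> records = new ArrayList<>();

        void write(MutableText key, String value) {
            String keySnapshot = new String(key.copyBytes(), StandardCharsets.UTF_8);
            records.add(new String[] {keySnapshot, value});
        }

        List<String[]> records() {
            return records;
        }
    }

    // Mirrors the snippet from the question: same key object, same key
    // contents, two writes.
    static List<String[]> run() {
        MutableText zip = new MutableText();
        CopyingCollector collector = new CopyingCollector();
        zip.set("9099");
        collector.write(zip, "value");
        zip.set("9099");
        collector.write(zip, "value1");
        return collector.records();
    }

    public static void main(String[] args) {
        for (String[] kv : run()) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

Under those assumptions both records survive, so a reducer keyed on "9099" would see both value and value1. The same reasoning is why a wordcount mapper can safely reuse one key object and call set() for every token.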