

Mohit Anchlia 2012-08-07, 18:26
Harsh J 2012-08-07, 18:33
Re: Basic Question
On Tue, Aug 7, 2012 at 11:33 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> Each write call registers (writes) a KV pair to the output. The output
> collector does not look for similarities, nor does it try to de-dupe
> pairs. Even if the object is the same, its value is copied at write
> time, so reusing the object doesn't matter.
>
> So you will get two KV pairs in your output - since duplication is
> allowed and is normal in several MR cases. Think of wordcount, where a
> map() call may emit lots of ("is", 1) pairs if there are multiple "is"
> in the line it processes, and can use set() calls to its benefit to
> avoid creating too many objects.
Thanks!

>
> On Tue, Aug 7, 2012 at 11:56 PM, Mohit Anchlia <[EMAIL PROTECTED]>
> wrote:
> > In Mapper I often use a global Text object, and throughout the map
> > processing I just call "set" on it. My question is, what happens if the
> > collector receives a similar byte array value? Does the last one
> > overwrite the value in the collector? So if I did
> >
> > Text zip = new Text();
> > zip.set("9099");
> > collector.write(zip,value);
> > zip.set("9099");
> > collector.write(zip,value1);
> >
> > Should I expect to receive both values in reducer or just one?
>
>
>
> --
> Harsh J
>
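
The behaviour described in the reply can be sketched without a cluster. The
snippet below is only an illustration, not Hadoop code: MutableKey and
Collector are hypothetical stand-ins for Text and the output collector, written
on the assumption (taken from the reply above) that each write copies the key's
current bytes and never compares them with earlier records.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class ReuseDemo {
    // Hypothetical stand-in for a mutable, reusable key such as Text.
    static final class MutableKey {
        private byte[] bytes = new byte[0];

        void set(String s) {
            bytes = s.getBytes(StandardCharsets.UTF_8);
        }

        byte[] copyBytes() {
            return bytes.clone();
        }
    }

    // Hypothetical stand-in for the output collector: it snapshots the key's
    // bytes on every write and does not look at records written earlier.
    static final class Collector {
        final List<String> records = new ArrayList<>();

        void write(MutableKey key, String value) {
            String keyCopy = new String(key.copyBytes(), StandardCharsets.UTF_8);
            records.add(keyCopy + "\t" + value);
        }
    }

    // Mirrors the sequence from the question: one key object, set twice.
    static List<String> run() {
        Collector collector = new Collector();
        MutableKey zip = new MutableKey();
        zip.set("9099");
        collector.write(zip, "value");
        zip.set("9099");
        collector.write(zip, "value1");
        zip.set("1234"); // changing the key later leaves written records alone
        return collector.records;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Under these assumptions the collector ends up holding two records with the key
9099, one for each write call, which matches the answer given in the thread:
both values reach the reducer.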