Mohit Anchlia 2012-08-07, 18:26
In my Mapper I often use a single global Text object and just call "set" on it throughout the map processing. My question is: what happens if the collector receives the same byte array value twice? Does the last write overwrite the earlier value in the collector? So if I did

    Text zip = new Text();
    zip.set("9099");
    collector.write(zip, value);
    zip.set("9099");
    collector.write(zip, value1);

should I expect to receive both values in the reducer, or just one?
Harsh J 2012-08-07, 18:33
Each write call registers (writes) a KV pair to the output. The output collector does not look for similarities, nor does it try to de-dupe anything. Even if the object is the same, its value is copied at the time of the write, so reusing the object doesn't matter.
So you will get two KV pairs in your output, since duplication is allowed and is normal in several MR cases. Think of wordcount, where a map() call may emit lots of ("is", 1) pairs if there are multiple "is" in the line it processes, and can use set() calls on a reused object to avoid creating too many objects.
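To make the copy-on-write behaviour described above concrete, here is a minimal, self-contained sketch. It does not use Hadoop at all: `MutableText`, `CopyingCollector` and `ReuseDemo` are hypothetical stand-ins invented for illustration, and the real framework serializes records into its own output buffer rather than into a list. The only point is that a collector which snapshots the key at write time keeps both pairs, however the key object is reused afterwards.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a reusable, mutable key such as Text (hypothetical class). */
final class MutableText {
    private byte[] bytes = new byte[0];

    void set(String s) {
        bytes = s.getBytes(StandardCharsets.UTF_8);
    }

    byte[] copyBytes() {
        return bytes.clone();
    }
}

/** Toy collector: copies the key at write time and never de-duplicates (hypothetical class). */
final class CopyingCollector {
    private final List<String[]> pairs = new ArrayList<>();

    void write(MutableText key, String value) {
        // Snapshot the key's current bytes; later set() calls cannot change this record.
        String keySnapshot = new String(key.copyBytes(), StandardCharsets.UTF_8);
        pairs.add(new String[] { keySnapshot, value });
    }

    List<String[]> pairs() {
        return pairs;
    }
}

final class ReuseDemo {
    static List<String[]> run() {
        CopyingCollector collector = new CopyingCollector();
        MutableText zip = new MutableText();   // one object, reused for every write

        zip.set("9099");
        collector.write(zip, "value");
        zip.set("9099");
        collector.write(zip, "value1");

        zip.set("10001");                      // has no effect on the pairs already written
        return collector.pairs();
    }

    public static void main(String[] args) {
        // Prints two lines: "9099<TAB>value" and "9099<TAB>value1".
        for (String[] kv : run()) {
            System.out.println(kv[0] + "\t" + kv[1]);
        }
    }
}
```

Both pairs survive with the key "9099", which matches the two KV pairs the reducer would see for that key in the case asked about.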
-- Harsh J
Mohit Anchlia 2012-08-07, 18:45
Thanks!