Hadoop user mailing list: the same key in different reducers


Messages in this thread:
Oleg Ruchovets   2010-06-09, 08:17
Ted Yu           2010-06-09, 21:06
Owen OMalley     2010-06-09, 21:22
James Seigel     2010-06-09, 21:40
Ted Yu           2010-06-09, 21:43
Alex Kozlov      2010-06-09, 22:15
Owen OMalley     2010-06-10, 02:30

Re: the same key in different reducers (Oleg Ruchovets)

Hi, and thank you for the answers. I hadn't checked my email, and now I see 7
answers. That is really great.

  Let me explain in more detail why I am asking such a strange question :-)

As I wrote before, I write to HBase from a Hadoop job; the writing actually
happens in the reduce phase of the job.
Suppose I have 3 reducers (all of them write to HBase) and that reducer 1 and
reducer 3 emit the same key.
In that case I would have to check whether HBase already contains that key
(which requires a read from HBase). If it does, I have to merge the new data
with the already-inserted record and then write the result back to HBase. BUT
in my case the data is organized in such a way that duplicate keys are not a
problem, so I can skip the expensive HBase read and use insert operations
only. To rely on inserts alone, though, I need to know that every reducer has
unique output keys, i.e. no two reducers ever emit the same K3.
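
To make the difference concrete, the two write paths I have in mind look
roughly like this (just an illustrative sketch against the plain HTable
client API; the table handle, column family, qualifier, and the byte
concatenation standing in for the merge are placeholders, not my real code):

import java.io.IOException;

import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseWritePaths {

    // Path 1: keys may repeat across reducers, so read first, merge, then write.
    static void checkedWrite(HTable table, byte[] row, byte[] family,
                             byte[] qualifier, byte[] value) throws IOException {
        Result existing = table.get(new Get(row));   // extra round trip for every key
        byte[] toStore = value;
        if (!existing.isEmpty()) {
            byte[] old = existing.getValue(family, qualifier);
            toStore = Bytes.add(old, value);         // placeholder "merge" rule
        }
        Put put = new Put(row);
        put.add(family, qualifier, toStore);
        table.put(put);
    }

    // Path 2: every K3 is unique across all reducers, so a blind insert is enough.
    static void blindWrite(HTable table, byte[] row, byte[] family,
                           byte[] qualifier, byte[] value) throws IOException {
        Put put = new Put(row);
        put.add(family, qualifier, value);
        table.put(put);                              // no read before the write
    }
}

The whole point of my question is whether I can safely use the second method
everywhere and never pay for the extra Get.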

input: InputFormat<K1,V1>
mapper: Mapper<K1,V1,K2,V2>
combiner: Reducer<K2,V2,K2,V2>
reducer: Reducer<K2,V2,K3,V3>
output: RecordWriter<K3,V3>
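
Concretely, a reducer in this pipeline might look like the hypothetical
sketch below (placeholder Text/IntWritable types and a simple sum standing in
for my real aggregation, using the org.apache.hadoop.mapreduce API). My
question is whether two reduce tasks running code like this can ever call
context.write() with the same key:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Hypothetical reducer: K2 = Text, V2 = IntWritable, and the output key K3 is
// simply the incoming key K2, emitted unchanged.
public class PassThroughKeyReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        // K3 == K2 here; this is exactly the case I am asking about.
        context.write(key, new IntWritable(sum));
    }
}

If K3 emitted this way can never repeat across reducers, I can rely on blind
inserts only.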

On Thu, Jun 10, 2010 at 12:40 AM, James Seigel <[EMAIL PROTECTED]> wrote:

> Oleg,
>
> Are you wanting to have them in different reducers?  If so, then you can
> write a Comparable object to make that happen.
>
> If you want them to be on the same reducer, then that is what Hadoop will
> do.
>
> :)
>
>
> On 2010-06-09, at 3:06 PM, Ted Yu wrote:
>
> > Can you disclose more about how K3 is generated?
> > From your description below, it is possible.
> >
> > On Wed, Jun 9, 2010 at 1:17 AM, Oleg Ruchovets <[EMAIL PROTECTED]>
> wrote:
> >
> >> Hi,
> >> My Hadoop job writes the results of map/reduce to HBase.
> >> I have 3 reducers.
> >>
> >> Here are the input and output types for the Mapper, Combiner, and
> >> Reducer:
> >>   input: InputFormat<K1,V1>
> >>   mapper: Mapper<K1,V1,K2,V2>
> >>   combiner: Reducer<K2,V2,K2,V2>
> >>   reducer: Reducer<K2,V2,K3,V3>
> >>   output: RecordWriter<K3,V3>
> >>
> >> My question:
> >> Is it possible that more than one reducer has the same output key K3?
> >> Meaning, in case I have 3 reducers, is it possible that
> >> reducer1    K3 - 1 , V3 [1,2,3]
> >> reducer2    K3 - 2 , V3 [5,6,9]
> >> reducer3    K3 - 1 , V3 [10,15,22]
> >>
> >> As you can see, reducer1 has K3 - 1 and reducer3 also has K3 - 1.
> >> So is that case possible, or does every reducer always have a unique
> >> output key?
> >>
> >> Thanks in advance
> >> Oleg.
> >>
>
>
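
For reference, the key-to-reducer assignment James refers to is done by
Hadoop's Partitioner; the default HashPartitioner picks the reduce task as
(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks, which is why all
records with the same K2 land on a single reducer. A custom Partitioner is
the standard hook for changing that assignment; here is a minimal sketch with
a hypothetical Text key and a placeholder routing rule:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Routes records to reduce tasks by the first character of the key instead of
// the full hash. The rule itself is only a placeholder for illustration.
public class FirstCharPartitioner extends Partitioner<Text, IntWritable> {

    @Override
    public int getPartition(Text key, IntWritable value, int numPartitions) {
        if (key.getLength() == 0) {
            return 0;
        }
        return (key.charAt(0) & Integer.MAX_VALUE) % numPartitions;
    }
}

// Wired in on the job with: job.setPartitionerClass(FirstCharPartitioner.class);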