HBase, mail # user - MapReduce: Reducers partitions.


Re: MapReduce: Reducers partitions.
Jean-Marc Spaggiari 2013-04-10, 14:03
Thanks Ted.

That's exactly where I was looking just now. I was close. I will take a deeper
look.

Thanks Nitin for the link. I will read that too.

JM
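
For readers following the thread, the region-based grouping that Ted's quoted Javadoc describes below can be sketched roughly as follows. This is only an illustration that runs without Hadoop or HBase on the classpath, not the actual HRegionPartitioner source. The class name, the String row keys (HBase compares raw bytes) and the wrap-around rule for the case of fewer reducers than regions are all assumptions made for the example.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Illustrative sketch only: mimics the idea behind HRegionPartitioner
 * (one reducer per region) without depending on Hadoop or HBase.
 * Row keys are plain Strings here; HBase itself compares raw bytes.
 */
final class RegionPartitionSketch {

    // Maps each region's start key to its index, ordered by start key.
    private final TreeMap<String, Integer> regionIndexByStartKey = new TreeMap<>();

    /** Start keys are given in region order; the first region starts at "". */
    RegionPartitionSketch(String... startKeys) {
        for (int i = 0; i < startKeys.length; i++) {
            regionIndexByStartKey.put(startKeys[i], i);
        }
    }

    /** Returns the reducer (partition) number for a row key. */
    int getPartition(String rowKey, int numReducers) {
        // The region holding rowKey is the one with the greatest start key <= rowKey.
        Map.Entry<String, Integer> region = regionIndexByStartKey.floorEntry(rowKey);
        int regionIndex = (region == null) ? 0 : region.getValue();
        // Assumption: with fewer reducers than regions, wrap around so the
        // result stays in range (several regions then share one reducer).
        return regionIndex % numReducers;
    }

    public static void main(String[] args) {
        // Three hypothetical regions: ["", "g"), ["g", "p"), ["p", end).
        RegionPartitionSketch partitioner = new RegionPartitionSketch("", "g", "p");
        for (String row : new String[] {"apple", "grape", "zebra"}) {
            System.out.println(row + " -> reducer " + partitioner.getPartition(row, 3));
        }
    }
}
```

With as many reducers as regions, every row key of a given region lands on the same reducer, which is the property the Javadoc describes. In a real job the class to use would be HRegionPartitioner itself, passed to the job setup, rather than anything hand-written like this.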

2013/4/10 Nitin Pawar <[EMAIL PROTECTED]>

> To add to what Ted said,
>
> the same discussion came up on the question Jean asked:
>
> https://issues.apache.org/jira/browse/HBASE-1287
>
>
> On Wed, Apr 10, 2013 at 7:28 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
>
> > Jean-Marc:
> > Take a look at HRegionPartitioner which is in both mapred and mapreduce
> > packages:
> >
> >  * This is used to partition the output keys into groups of keys.
> >
> >  * Keys are grouped according to the regions that currently exist
> >
> >  * so that each reducer fills a single region so load is distributed.
> >
> > Cheers
> >
> > On Wed, Apr 10, 2013 at 6:54 AM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> > > Hi Nitin,
> > >
> > > You got my question correctly.
> > >
> > > However, I'm wondering how it works when it's done in HBase. Do
> > > we have default partitioners, so that we have the same guarantee that
> > > records mapping to one key go to the same reducer? Or do we have to
> > > implement this on our own?
> > >
> > > JM
> > >
> > > 2013/4/10 Nitin Pawar <[EMAIL PROTECTED]>:
> > > > I hope I understood what you are asking. If not, then pardon me :)
> > > > A few lines from the Hadoop developer handbook:
> > > >
> > > > The *Partitioner* class determines which partition a given (key, value)
> > > > pair will go to. The default partitioner computes a hash value for the
> > > > key and assigns the partition based on this result. It guarantees that
> > > > all the records mapping to one key go to the same reducer.
> > > >
> > > > You can write your custom partitioner as well.
> > > > Here is the link:
> > > > http://developer.yahoo.com/hadoop/tutorial/module5.html#partitioning
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Apr 10, 2013 at 6:19 PM, Jean-Marc Spaggiari <
> > > > [EMAIL PROTECTED]> wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> Quick question. How is the data from the map tasks partitioned for
> > > >> the reducers?
> > > >>
> > > >> If there is 1 reducer, it's easy, but if there are more, are all the
> > > >> same keys guaranteed to end up on the same reducer? Or not necessarily?
> > > >> If they are not, how can we provide a partitioning function?
> > > >>
> > > >> Thanks,
> > > >>
> > > >> JM
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > Nitin Pawar
> > >
> >
>
>
>
> --
> Nitin Pawar
>
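
For reference, the default behaviour quoted from the Hadoop tutorial above (hash the key, then assign a partition from that hash) can be sketched without any Hadoop classes. As far as I know, Hadoop's HashPartitioner uses this same mask-and-modulo formula, but the class name and the demo keys here are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch only: hash partitioning of map output keys,
 * written without Hadoop dependencies so it runs on its own.
 */
final class HashPartitionSketch {

    /** Picks a reducer for a key; equal keys always yield the same number. */
    static int getPartition(Object key, int numReducers) {
        // Masking with Integer.MAX_VALUE clears the sign bit, so a negative
        // hashCode() cannot produce a negative partition number.
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        int numReducers = 4;

        // One bucket per reducer, standing in for the reducers' inputs.
        List<List<String>> reducers = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            reducers.add(new ArrayList<>());
        }

        // Hypothetical map output keys; duplicates end up in the same bucket.
        for (String key : new String[] {"row1", "row2", "row1", "row3", "row2"}) {
            reducers.get(getPartition(key, numReducers)).add(key);
        }

        for (int i = 0; i < numReducers; i++) {
            System.out.println("reducer " + i + ": " + reducers.get(i));
        }
    }
}
```

Because the partition depends only on the key's hash and the number of reducers, all records with the same key reach the same reducer no matter which map task emitted them. A custom partitioner would swap in its own rule for the body of getPartition; in a real job it would extend Hadoop's Partitioner class and be registered with Job.setPartitionerClass.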