Pig >> mail # user >> DISTINCT and partitioner


Re: DISTINCT and partitioner
You're correct. It looks like an optimization was added that makes DISTINCT use a special partitioner, which prevents the user from setting their own. Could you file a JIRA against the docs so we can get that fixed?

Alan.

On Jul 17, 2013, at 11:27 AM, William Oberman wrote:

> The docs say DISTINCT can take a custom partitioner.  How does that work?
> What is "K" and "V"?
> I'm having some doubts the docs are correct.  I wrote a test partitioner
> that does a System.out of K and V.  I then wrote simple scripts to do JOIN,
> GROUP and DISTINCT.  For JOIN and GROUP I see my system.outs(*).  For
> DISTINCT, I see nothing....
>
> Using 0.11.1.
>
> will
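
For readers who want to repeat the experiment, here is a minimal sketch of the kind of test partitioner described above. It is not the original poster's code. `LoggingPartitioner` and `PartitionerDemo` are hypothetical names, and the abstract `Partitioner` class is a stand-in for `org.apache.hadoop.mapreduce.Partitioner`, included only so the example compiles without Hadoop on the classpath. In a real Pig job the class would extend the Hadoop one, and, as far as I recall from the example in the Pig docs, K would be `PigNullableWritable` and V a `Writable`; treat those types as an assumption to check against your Pig version.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for org.apache.hadoop.mapreduce.Partitioner (assumption: same
// method signature), so this sketch runs without Hadoop installed.
abstract class Partitioner<K, V> {
    public abstract int getPartition(K key, V value, int numPartitions);
}

// Hypothetical test partitioner: prints and records every (K, V) pair it
// is asked to route, then falls back to plain hash partitioning.
class LoggingPartitioner<K, V> extends Partitioner<K, V> {
    final List<String> seen = new ArrayList<>();

    @Override
    public int getPartition(K key, V value, int numPartitions) {
        String entry = "K=" + key + " V=" + value;
        seen.add(entry);
        System.out.println(entry);
        // Clear the sign bit so the result always lies in [0, numPartitions).
        int hash = (key == null) ? 0 : key.hashCode();
        return (hash & Integer.MAX_VALUE) % numPartitions;
    }
}

public class PartitionerDemo {
    public static void main(String[] args) {
        LoggingPartitioner<String, Integer> p = new LoggingPartitioner<>();
        int reducers = 4;
        System.out.println("partition = " + p.getPartition("alice", 1, reducers));
        System.out.println("partition = " + p.getPartition("bob", 2, reducers));
    }
}
```

In a Pig script, a class like this would be named in the `PARTITION BY` clause of an operator such as GROUP or JOIN, together with `PARALLEL`. Per Alan's reply, on 0.11.1 DISTINCT installs its own partitioner, so a user-supplied class is never called there, which matches the missing output.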