Re: How to LIMIT a relation by percentage
Hi Ruslan -- no need to write your own UDF.  There is a built-in
function TOP() which returns the top N tuples of a relation (passed
as a bag), ranked by a column you choose; N is given as the function's
first argument.  Take a look at:

http://pig.apache.org/docs/r0.9.0/func.html#topx

Norbert
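
For illustration, a minimal sketch of the TOP() usage described above. The
file name, schema, and aliases are hypothetical. TOP() takes a count, not a
percentage, so N is a fixed illustrative value here; for the top 5% it would
first have to be derived from the size of the relation (for example with
COUNT), which this sketch does not attempt.

```pig
-- Hypothetical input: tab-separated (name, score) records in 'scores.txt'.
data = LOAD 'scores.txt' AS (name:chararray, score:int);

-- TOP() operates on a bag, so collect the whole relation into one group.
grouped = GROUP data ALL;

-- TOP(topN, column, relation): keep the 10 tuples with the largest value
-- in column 1 (0-based, i.e. score), then flatten the bag back into tuples.
top10 = FOREACH grouped GENERATE FLATTEN(TOP(10, 1, data));

DUMP top10;
```

The tuples that TOP() returns are not guaranteed to come back sorted, so an
ORDER BY on the result is still needed if the output order matters.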

On Thu, Sep 8, 2011 at 9:13 AM, Ruslan Al-Fakikh
<[EMAIL PROTECTED]> wrote:
> Hey guys,
>
> How can I LIMIT a relation by percentage?
> What I need is to sort a relation by a numeric column and then take
> top 5% of tuples.
> As far as I understand I cannot use an expression in the LIMIT
> operator. Do I have to write my own UDF? What type of UDF should I use
> then?
>
> --
> Best Regards,
> Ruslan Al-Fakikh
>