Pig >> mail # user >> How to LIMIT a relation by percentage

Re: How to LIMIT a relation by percentage
Hi Ruslan -- no need to write your own UDF.  There is a built-in
function TOP() which will extract for you the top N tuples of a
relation, where N is a configurable parameter.  Take a look at:



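A minimal sketch of how TOP() might be used for this, not a tested recipe. The file name, schema, and the parameter name N are assumptions made for illustration. TOP takes a count, a 0-based column index, and a bag, so the relation is first grouped into a single bag:

```pig
-- Hypothetical input: (name, score) records; file name and schema are assumed.
A = LOAD 'scores.txt' AS (name:chararray, score:double);

-- TOP operates on a bag, so collect the whole relation into one group.
G = GROUP A ALL;

-- TOP(count, column_index, bag) keeps the tuples with the largest values
-- in the given column (0-based, so 1 refers to 'score').
-- N is a hypothetical script parameter, e.g.: pig -param N=50 top.pig
T = FOREACH G GENERATE FLATTEN(TOP($N, 1, A));

DUMP T;
```

Note that N here is an absolute count, not a percentage. To get the top 5%, N would have to be derived from the row count (for example, computed in a separate step and passed in as the parameter).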
On Thu, Sep 8, 2011 at 9:13 AM, Ruslan Al-Fakikh wrote:
> Hey guys,
> How can I LIMIT a relation by percentage?
> What I need is to sort a relation by a numeric column and then take
> top 5% of tuples.
> As far as I understand I cannot use an expression in the LIMIT
> operator. Do I have to write my own UDF? What type of UDF should I use
> then?
> --
> Best Regards,
> Ruslan Al-Fakikh