Ruslan Al-Fakikh 2011-09-08, 13:13
Norbert Burger 2011-09-08, 13:42
Dmitriy Ryaboy 2011-09-08, 16:19
Hi Dmitriy -- great info, thanks.
On Thu, Sep 8, 2011 at 12:19 PM, Dmitriy Ryaboy <[EMAIL PROTECTED]> wrote:
> You could also do it with TOP as Norbert suggests, but that has a bit of
> extra cost due to the sort TOP does.
Just for my understanding, doesn't the ORDER BY in the PIG-1926
example impose the same sort cost? It seems that you'd have to pay for
a sort as long as the requirement is top N.
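
To make the cost question concrete, here is a minimal sketch in plain Python (not Pig Latin, and not a claim about how Pig implements TOP or ORDER BY). It contrasts a full sort followed by a slice with a bounded selection using heapq.nlargest. The function names and the sample data are hypothetical, made up for illustration only.

```python
import heapq
import math


def top_percent_by_sort(rows, key, percent):
    """Sort every row in descending order, then keep the first ceil(percent% of n)."""
    n = math.ceil(len(rows) * percent / 100)
    return sorted(rows, key=key, reverse=True)[:n]


def top_percent_by_heap(rows, key, percent):
    """Select the n largest rows without sorting the whole input."""
    n = math.ceil(len(rows) * percent / 100)
    return heapq.nlargest(n, rows, key=key)


# Hypothetical data: 40 (id, score) tuples with distinct scores.
rows = [(i, (i * 37) % 101) for i in range(40)]

by_sort = top_percent_by_sort(rows, key=lambda r: r[1], percent=5)
by_heap = top_percent_by_heap(rows, key=lambda r: r[1], percent=5)
```

Both functions return the same rows. With a small fixed N, the bounded selection does less work than a full sort; when N is a percentage of the input, N grows with the relation, so the gap narrows. Either way the row count has to be known before the cutoff can be computed.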
> On Thu, Sep 8, 2011 at 6:42 AM, Norbert Burger <[EMAIL PROTECTED]> wrote:
>> Hi Ruslan -- no need to write your own UDF. There is a built-in
>> function TOP() which will extract for you the top N tuples of a
>> relation, where N is a configurable parameter. Take a look at:
>> On Thu, Sep 8, 2011 at 9:13 AM, Ruslan Al-Fakikh
>> <[EMAIL PROTECTED]> wrote:
>> > Hey guys,
>> > How can I LIMIT a relation by percentage?
>> > What I need is to sort a relation by a numeric column and then take
>> > top 5% of tuples.
>> > As far as I understand I cannot use an expression in the LIMIT
>> > operator. Do I have to write my own UDF? What type of UDF should I use
>> > then?
>> > --
>> > Best Regards,
>> > Ruslan Al-Fakikh
Ruslan Al-Fakikh 2011-09-08, 19:46
Dmitriy Ryaboy 2011-09-09, 00:43
Ruslan Al-Fakikh 2011-09-09, 11:20
Dmitriy Ryaboy 2011-09-09, 16:19
Ruslan Al-Fakikh 2011-09-12, 09:46
Dmitriy Ryaboy 2011-09-09, 00:45