kumar mr 2013-02-21, 07:33
Owen OMalley 2013-02-21, 16:08
Ashutosh Chauhan 2013-02-21, 16:17
Stephen Boesch 2013-02-21, 20:17
Ashutosh Chauhan 2013-02-22, 01:44
Re: ROW_NUMBER() equivalent in Hive
kumar mr 2013-02-21, 19:12
It's for the entire table. The sample Teradata (TD) query looks like this:
,ROW_NUMBER() OVER (PARTITION BY columnA, columnB, columnC ORDER BY columnX DESC, columnY DESC) AS rank
FROM table a
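To make the semantics of that clause concrete, here is a minimal Python sketch of what ROW_NUMBER() with PARTITION BY and a descending ORDER BY computes. This is only an illustration, not Hive or Teradata code. The helper name `row_number` and the sample rows are made up for the example; only the column names come from the query above.

```python
from collections import defaultdict


def row_number(rows, partition_by, order_by_desc):
    """Return copies of rows with a 1-based 'rank' assigned within each partition.

    Rows are grouped on the partition_by columns, then each group is
    sorted descending on the order_by_desc columns before numbering.
    """
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in partition_by)
        partitions[key].append(row)

    out = []
    for group in partitions.values():
        ordered = sorted(
            group,
            key=lambda r: tuple(r[c] for c in order_by_desc),
            reverse=True,
        )
        # Numbering restarts at 1 for every partition.
        for n, row in enumerate(ordered, start=1):
            out.append({**row, "rank": n})
    return out


# Hypothetical sample data: two partitions on (columnA, columnB, columnC).
rows = [
    {"columnA": "a", "columnB": "b", "columnC": "c", "columnX": 1, "columnY": 5},
    {"columnA": "a", "columnB": "b", "columnC": "c", "columnX": 3, "columnY": 2},
    {"columnA": "a", "columnB": "b", "columnC": "c", "columnX": 3, "columnY": 9},
    {"columnA": "z", "columnB": "b", "columnC": "c", "columnX": 7, "columnY": 0},
]

ranked = row_number(
    rows,
    partition_by=["columnA", "columnB", "columnC"],
    order_by_desc=["columnX", "columnY"],
)
for r in ranked:
    print(r)
```

So the numbering is per partition: it restarts at 1 whenever the (columnA, columnB, columnC) combination changes, and ties on columnX are broken by columnY.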
From: Owen O'Malley <[EMAIL PROTECTED]>
To: user <[EMAIL PROTECTED]>
Sent: Thu, Feb 21, 2013 8:08 am
Subject: Re: ROW_NUMBER() equivalent in Hive
What are the semantics for ROW_NUMBER? Is it a global row number? Per partition? Per bucket?
On Wed, Feb 20, 2013 at 11:33 PM, kumar mr <[EMAIL PROTECTED]> wrote:
This is Kumar, and this is my first question in this group.
I need to implement Teradata's ROW_NUMBER() in Hive, with partitioning on multiple columns and ordering on multiple columns.
It can be easily implemented in Hadoop MR, but I have to do it in Hive. A UDF can number the rows within each grouping key, assuming the dataset is small, but the ordering needs to be done in a prior step.
Can we do this in a much simpler way?
Thanks in advance.
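The UDF idea described in the question (order first, then number rows in a single pass) can be sketched in Python as below. This only simulates the logic; it is not a Hive UDF, and the class name `RowNumberState`, its `evaluate` method, and the sample rows are hypothetical. It assumes the rows reach the counter already grouped by the partition columns and sorted within each group, which in Hive would typically be arranged by an earlier step such as a subquery using DISTRIBUTE BY and SORT BY.

```python
class RowNumberState:
    """Stateful counter that restarts whenever the partition key changes.

    It only gives correct row numbers if the input is already grouped by
    key and sorted in the desired order within each group.
    """

    def __init__(self):
        self.last_key = None
        self.counter = 0

    def evaluate(self, *key):
        if key != self.last_key:
            # New partition: remember the key and restart the numbering.
            self.last_key = key
            self.counter = 0
        self.counter += 1
        return self.counter


# Hypothetical rows: (columnA, columnB, columnC, columnX, columnY).
rows = [
    ("k1", "p", "q", 10, 1),
    ("k2", "p", "q", 4, 4),
    ("k1", "p", "q", 10, 7),
    ("k1", "p", "q", 2, 3),
]

# Prior step: order by (columnX, columnY) descending, then stable-sort on
# the partition columns so each partition is contiguous and keeps that order.
presorted = sorted(rows, key=lambda r: (r[3], r[4]), reverse=True)
presorted = sorted(presorted, key=lambda r: r[:3])

# Numbering step: one pass over the pre-sorted rows.
state = RowNumberState()
numbered = [(*r, state.evaluate(*r[:3])) for r in presorted]
for row in numbered:
    print(row)
```

The counter keeps only the previous key and a running count, so it works row by row. The trade-off is that it depends entirely on the prior sort, and on all rows of one partition being processed by the same task in that order.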