Hive, mail # user - ROW_NUMBER() equivalent in Hive


kumar mr 2013-02-21, 07:33
Owen OMalley 2013-02-21, 16:08
Ashutosh Chauhan 2013-02-21, 16:17
Stephen Boesch 2013-02-21, 20:17
Ashutosh Chauhan 2013-02-22, 01:44

Re: ROW_NUMBER() equivalent in Hive
kumar mr 2013-02-21, 19:12

Owen,
It's for the entire table. A sample Teradata query looks like this:

SELECT
        columnA
        , columnB
        , columnC
        , columnD
        , columnX
        , ROW_NUMBER() OVER (PARTITION BY columnA, columnB, columnC ORDER BY columnX DESC, columnY DESC) AS rank
FROM table a

Regards,
Kumar

-----Original Message-----
From: Owen O'Malley <[EMAIL PROTECTED]>
To: user <[EMAIL PROTECTED]>
Sent: Thu, Feb 21, 2013 8:08 am
Subject: Re: ROW_NUMBER() equivalent in Hive
What are the semantics for ROW_NUMBER? Is it a global row number? Per a partition? Per a bucket?
-- Owen
On Wed, Feb 20, 2013 at 11:33 PM, kumar mr <[EMAIL PROTECTED]> wrote:

Hi,
This is Kumar, and this is my first question in this group.
I have a requirement to implement Teradata's ROW_NUMBER() in Hive, with partitioning on multiple columns and ordering on multiple columns.
It can be implemented easily in Hadoop MR, but I have to do it in Hive. Since the dataset is small, a UDF could assign the rank within each grouping key, but the ordering would need to be done in a prior step.
Can we do this in a much simpler way?
Thanks in advance.
Regards,
Kumar
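
For readers following the thread, the two-step idea described above (order the rows first, then number them within each grouping key) can be sketched outside Hive in a few lines of Python. This is only an illustration of the ROW_NUMBER() semantics in the sample query, not Hive code and not anyone's actual UDF; the function name row_number and the sample rows are made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def row_number(rows, partition_keys, order_keys_desc):
    """Return copies of rows with a 'rank' field that restarts at 1 per partition.

    Hypothetical helper mimicking
    ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC).
    """
    partition_of = itemgetter(*partition_keys)
    # Step 1 (the "prior step"): order the data. Two stable sorts give
    # partition keys ascending, then order keys descending inside each partition.
    ordered = sorted(rows, key=itemgetter(*order_keys_desc), reverse=True)
    ordered = sorted(ordered, key=partition_of)
    # Step 2: walk the ordered rows and restart the counter whenever the
    # partition key changes, which is what a rank-assigning UDF would do.
    numbered = []
    for _, group in groupby(ordered, key=partition_of):
        for n, row in enumerate(group, start=1):
            numbered.append({**row, "rank": n})
    return numbered


# Made-up sample data using the column names from the query above.
rows = [
    {"columnA": "a", "columnB": 1, "columnC": "x", "columnX": 10, "columnY": 1},
    {"columnA": "a", "columnB": 1, "columnC": "x", "columnX": 30, "columnY": 2},
    {"columnA": "a", "columnB": 1, "columnC": "x", "columnX": 30, "columnY": 5},
    {"columnA": "b", "columnB": 2, "columnC": "y", "columnX": 7, "columnY": 0},
]

for r in row_number(rows, ["columnA", "columnB", "columnC"], ["columnX", "columnY"]):
    print(r["columnA"], r["columnX"], r["columnY"], r["rank"])
```

The numbering is only meaningful because the rows are already ordered when the counter runs, which is why the ordering has to happen before the rank-assigning step.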