

Re: ROW_NUMBER() equivalent in Hive

Owen,
It's for the entire table. A sample Teradata (TD) query looks like this:

SELECT
        columnA
        , columnB
        , columnC
        , columnD
        , columnX
        , ROW_NUMBER() OVER (PARTITION BY columnA, columnB, columnC ORDER BY columnX DESC, columnY DESC) AS rank
FROM table a
Regards,
Kumar
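
For readers unfamiliar with the function, the semantics of the query above can be sketched in plain Python: rows are grouped by the PARTITION BY columns, sorted within each group by the ORDER BY columns (descending here), and numbered from 1. This is only an illustration of the semantics, not Hive or Teradata code; the function name `row_number` and the sample rows are made up for the example.

```python
from itertools import groupby
from operator import itemgetter


def row_number(rows, partition_by, order_by_desc):
    """Return copies of rows with a 'rank' field numbered 1..n per partition.

    Mimics ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ... DESC).
    Hypothetical helper for illustration only.
    """
    part_key = itemgetter(*partition_by)
    order_key = itemgetter(*order_by_desc)
    # Sort by the ordering columns (descending) first, then by the partition
    # key; Python's sort is stable, so the inner ordering is preserved.
    ordered = sorted(rows, key=order_key, reverse=True)
    ordered = sorted(ordered, key=part_key)
    result = []
    for _, group in groupby(ordered, key=part_key):
        # Numbering restarts at 1 for every partition.
        for n, row in enumerate(group, start=1):
            result.append({**row, "rank": n})
    return result


# Made-up sample data: three rows in one partition, one row in another.
rows = [
    {"columnA": "a", "columnB": "b", "columnC": "c", "columnX": 1, "columnY": 5},
    {"columnA": "a", "columnB": "b", "columnC": "c", "columnX": 3, "columnY": 2},
    {"columnA": "a", "columnB": "b", "columnC": "c", "columnX": 3, "columnY": 9},
    {"columnA": "z", "columnB": "b", "columnC": "c", "columnX": 7, "columnY": 0},
]
ranked = row_number(
    rows,
    partition_by=["columnA", "columnB", "columnC"],
    order_by_desc=["columnX", "columnY"],
)
```

Note that the numbering is per partition, so the single row with columnA = "z" gets rank 1 even though it has the largest columnX overall.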

-----Original Message-----
From: Owen O'Malley <[EMAIL PROTECTED]>
To: user <[EMAIL PROTECTED]>
Sent: Thu, Feb 21, 2013 8:08 am
Subject: Re: ROW_NUMBER() equivalent in Hive
What are the semantics for ROW_NUMBER? Is it a global row number? Per a partition? Per a bucket?
-- Owen
On Wed, Feb 20, 2013 at 11:33 PM, kumar mr <[EMAIL PROTECTED]> wrote:

Hi,
This is Kumar, and this is my first question in this group.
I need to implement the equivalent of Teradata's ROW_NUMBER() in Hive, with partitioning on multiple columns and ordering by multiple columns.
This is easy to implement in Hadoop MapReduce, but I have to do it in Hive. With a UDF I can assign the rank within each grouping key, given that the dataset is small, but the ordering needs to be done in a prior step.
Is there a much simpler way to do this?
Thanks in advance.
Regards,
Kumar
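
The UDF approach described in this message, where the ordering is done in a prior step and a stateful function then numbers the rows, can be sketched roughly as follows. The sketch assumes the rows already arrive grouped by the partition key and sorted within each key (in Hive this prior step would typically be a DISTRIBUTE BY on the partition columns plus a SORT BY). The name `number_presorted` and the sample data are hypothetical.

```python
def number_presorted(rows, key_columns):
    """Yield (row, n), where n restarts at 1 whenever the key changes.

    Assumes rows are already grouped by key and ordered within each key;
    the function itself does no sorting. Hypothetical helper.
    """
    previous_key = object()  # sentinel that never equals a real key
    counter = 0
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        if key != previous_key:
            # New partition: reset the counter.
            counter = 0
            previous_key = key
        counter += 1
        yield row, counter


# Made-up input, already sorted by k and then by x descending.
presorted = [
    {"k": "a", "x": 9},
    {"k": "a", "x": 4},
    {"k": "b", "x": 7},
]
numbers = [n for _, n in number_presorted(presorted, ["k"])]
```

Because the counter only compares each row with the previous one, the result is correct only if every row of a given key reaches the same function instance in sorted order, which is why the ordering has to be done beforehand.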
