
kumar mr 2013-02-21, 07:33
Owen OMalley 2013-02-21, 16:08
Re: ROW_NUMBER() equivalent in Hive

If you are willing to be on the bleeding edge, some of us are developing this
and a lot of other partitioning and windowing functionality in a branch over at:
Check out this branch, build Hive, and then you will have row_number()
functionality. Look in
ql/src/test/queries/clientpositive/ptf_general_queries.q, which has about 60
example queries demonstrating the capabilities that we already have working
(including row_number).
We hope to have this branch merged into trunk soon.
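
For readers unfamiliar with the semantics being asked about, here is a minimal sketch of ROW_NUMBER() with partitioning on multiple columns and ordering on multiple columns. It uses standard SQL window syntax and runs it through Python's built-in sqlite3 module (this needs a bundled SQLite of 3.25 or later) purely so the example is runnable. It is not taken from ptf_general_queries.q, and whether the Hive branch accepts exactly this form is an assumption. The table `sales` and its columns are hypothetical.

```python
import sqlite3

# Hypothetical table and data, used only to illustrate the semantics.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, store TEXT, sale_date TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("east", "s1", "2013-02-01", 50),
        ("east", "s1", "2013-02-01", 70),
        ("east", "s1", "2013-02-02", 20),
        ("west", "s2", "2013-02-01", 10),
    ],
)

# Standard SQL window syntax (assumed, not confirmed, to match the Hive branch):
# numbering restarts for each (region, store) pair, and rows within a pair
# are numbered by sale_date ascending, then amount descending.
QUERY = """
SELECT region, store, sale_date, amount,
       ROW_NUMBER() OVER (
           PARTITION BY region, store
           ORDER BY sale_date, amount DESC
       ) AS rn
FROM sales
ORDER BY region, store, rn
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

The numbering starts again at 1 for every distinct (region, store) combination, which is the behavior the original question describes for Teradata's ROW_NUMBER().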

Hope it helps,
On Wed, Feb 20, 2013 at 11:33 PM, kumar mr <[EMAIL PROTECTED]> wrote:

> Hi,
>  This is Kumar, and this is my first question in this group.
>  I have a requirement to implement ROW_NUMBER() from Teradata in Hive,
> where partitioning happens on multiple columns along with multiple-column
> ordering.
> It can be easily implemented in Hadoop MR, but I have to do it in Hive. A
> UDF can assign the same rank within a grouping key, assuming the dataset is
> small, but the ordering needs to be done in a prior step.
> Can we do this in a much simpler way?
>  Thanks in advance.
>  Regards,
> Kumar