Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # user - Invalid Function rank in HiveQL


Copy link to this message
-
Re: Invalid Function rank in HiveQL
Raihan Jamal 2012-07-10, 23:18
Thanks Vijay for reply. But it doesn't works out the way I needed. I am
getting full data back for each BUYER_ID. Let me explain you more.

This is the below data in the table and you can see in the below data this
BUYER_ID *- 1015826235 *appears 13 times and this BUYER_ID -*34512201 *appears
12 times, And I need TOP 10 for each of the BUYER_ID BASIS on time-

BUYER_ID    |    ITEM_ID       |    CREATED_TIME
------------+------------------+-----------------------
1015826235      220003038067        2012-07-09 19:40:21,
1015826235      300003861266        2012-07-09 18:19:59,
1015826235      140002997245        2012-07-09 09:23:17,
1015826235      210002448035        2012-07-09 22:21:11,
1015826235      260003553381        2012-07-09 07:09:56,
1015826235      260003553382        2012-07-09 19:40:39,
1015826235      260003553383        2012-07-09 06:58:47,
1015826235      260003553384        2012-07-09 07:28:47,
1015826235      260003553385        2012-07-09 08:48:47,
1015826235      260003553386        2012-07-09 06:38:47,
1015826235      260003553387        2012-07-09 05:38:47,
1015826235      260003553388        2012-07-09 04:55:47,
1015826235      260003553389        2012-07-09 06:54:37,
34512201        597245693           2012-07-09 16:20:21,
34512201        8071787728          2012-07-09 15:19:59,
34512201        5868222883          2012-07-09 08:23:17,
34512201        2412180494          2012-07-09 22:21:11,
34512201        2422054205          2012-07-09 06:09:56,
34512201        1875744030          2012-07-09 19:40:39,
34512201        5639158173          2012-07-09 06:58:47,
34512201        5656232360          2012-07-09 07:28:47,
34512201        959188449           2012-07-09 08:48:47,
34512201        4645350592          2012-07-09 06:38:47,
34512201        5657320532          2012-07-09 05:38:47,
34512201        290419656539        2012-07-09 04:55:47,
So basically if I do it like this-

SELECT * FROM TestingTable1 ORDER BY buyer_id, created_time DESC;

everything will get sorted in descending order for each BUYER_ID and then I
need to pick up TOP 10 rows for each BUYER_ID.

And with your query that you just said, it's not working. I am getting full
data back like this. Below is the result I got from your query-

34512201        597245693       0       2012-07-09 16:20:21
34512201        959188449       0       2012-07-09 08:48:47
34512201        1875744030      0       2012-07-09 19:40:39
34512201        2412180494      0       2012-07-09 22:21:11
34512201        2422054205      0       2012-07-09 06:09:56
34512201        4645350592      0       2012-07-09 06:38:47
34512201        5639158173      0       2012-07-09 06:58:47
34512201        5656232360      0       2012-07-09 07:28:47
34512201        5657320532      0       2012-07-09 05:38:47
34512201        5868222883      0       2012-07-09 08:23:17
34512201        8071787728      0       2012-07-09 15:19:59
34512201        290419656539    0       2012-07-09 04:55:47
1015826235      140002997245    0       2012-07-09 09:23:17
1015826235      210002448035    0       2012-07-09 22:21:11
1015826235      220003038067    0       2012-07-09 19:40:21
1015826235      260003553381    0       2012-07-09 07:09:56
1015826235      260003553382    0       2012-07-09 19:40:39
1015826235      260003553383    0       2012-07-09 06:58:47
1015826235      260003553384    0       2012-07-09 07:28:47
1015826235      260003553385    0       2012-07-09 08:48:47
1015826235      260003553386    0       2012-07-09 06:38:47
1015826235      260003553387    0       2012-07-09 05:38:47
1015826235      260003553388    0       2012-07-09 04:55:47
1015826235      260003553389    0       2012-07-09 06:54:37
1015826235      300003861266    0       2012-07-09 18:19:59
This is my Rank function that I have created-

package com.example.hive.udf;
import org.apache.hadoop.hive.ql.exec.UDF;

public final class Rank extends UDF{
    private int  counter;
    private String last_key;
    public int evaluate(final String key){
  if ( !key.equalsIgnoreCase(this.last_key) ) {
     this.counter = 0;
     this.last_key = key;
  }
  return this.counter++;
    }
}
*Raihan Jamal*

On Tue, Jul 10, 2012 at 4:01 PM, Vijay <[EMAIL PROTECTED]> wrote: