Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Find TOP 10 using HiveQL


Copy link to this message
-
Re: Find TOP 10 using HiveQL
This is my below requirement. I need-  *Find `TOP 10` data for each
`BUYER_ID, *So I cannot use LIMIT 10 here in this case.

This is the below table

    CREATE TABLE IF NOT EXISTS TestingTable1
    (
    BUYER_ID BIGINT,
    ITEM_ID BIGINT,
    CREATED_TIME STRING
    )

And this is the below data in the above table-

    BUYER_ID    |    ITEM_ID   | CREATED_TIME
    ------------+------------------+-----------------------
    1015826235      220003038067        2012-07-09 19:40:21,
    1015826235      300003861266        2012-07-09 18:19:59,
    1015826235      140002997245        2012-07-09 09:23:17,
    1015826235      210002448035        2012-07-09 22:21:11,
    1015826235      260003553381        2012-07-09 07:09:56,
    1015826235      260003553382        2012-07-09 19:40:39,
    1015826235      260003553383        2012-07-09 06:58:47,
    1015826235      260003553384        2012-07-09 07:28:47,
    1015826235      260003553385        2012-07-09 08:48:47,
    1015826235      260003553386        2012-07-09 06:38:47,
    1015826235      260003553387        2012-07-09 05:38:47,
    1015826235      260003553388        2012-07-09 04:55:47,
    1015826235      260003553389        2012-07-09 06:54:37,
    34512201        597245693           2012-07-09 16:20:21,
    34512201        8071787728          2012-07-09 15:19:59,
    34512201        5868222883          2012-07-09 08:23:17,
    34512201        2412180494          2012-07-09 22:21:11,
    34512201        2422054205          2012-07-09 06:09:56,
    34512201        1875744030          2012-07-09 19:40:39,
    34512201        5639158173          2012-07-09 06:58:47,
    34512201        5656232360          2012-07-09 07:28:47,
    34512201        959188449 2012-07-09 08:48:47,
    34512201        4645350592          2012-07-09 06:38:47,
    34512201        5657320532 2012-07-09 05:38:47,
    34512201        290419656539 2012-07-09 04:55:47,

If you see the above data in the table, there are only two UNIQUE
`BUYER_ID` and corresponding to those I have `ITEM_ID` AND `CREATED_TIME`.
I need only 10 latest record basis on the time for that given day for each
`BUYER_ID`.

So for this `BUYER_ID` - `34512201` I need 10 latest record basis on
`CREATED_TIME` for that given day only, it means for today's date I need 10
latest record for each `BUYER_ID`.

And each `BUYER_ID` can have any day's data. But I am specifically
interested for day before  today's data(means yesterday's date always) by
checking at the `CREATED_TIME`

**Find `TOP 10` data for each `BUYER_ID`. Below is the sample output.**

Sample Output.

    BUYER_ID    |    ITEM_ID   | CREATED_TIME
    ------------+------------------+-----------------------
    34512201        2412180494          2012-07-09 22:21:11
    34512201        1875744030          2012-07-09 19:40:39
    34512201        597245693           2012-07-09 16:20:21
    34512201        8071787728          2012-07-09 15:19:59
    34512201        959188449 2012-07-09 08:48:47
    34512201        5868222883          2012-07-09 08:23:17
    34512201        5656232360          2012-07-09 07:28:47
    34512201        5639158173          2012-07-09 06:58:47
    34512201        4645350592          2012-07-09 06:38:47
    34512201        2422054205          2012-07-09 06:09:56
    1015826235      210002448035        2012-07-09 22:21:11
    1015826235      260003553382        2012-07-09 19:40:39
    1015826235      220003038067        2012-07-09 19:40:21
    1015826235      300003861266        2012-07-09 18:19:59
    1015826235      140002997245        2012-07-09 09:23:17
    1015826235      260003553385        2012-07-09 08:48:47
    1015826235      260003553384        2012-07-09 07:28:47
    1015826235      260003553381        2012-07-09 07:09:56
    1015826235      260003553383        2012-07-09 06:58:47
    1015826235      260003553389        2012-07-09 06:54:37
*Raihan Jamal*

On Mon, Jul 9, 2012 at 7:56 PM, Andes <[EMAIL PROTECTED]> wrote:

> **
> hello, you can use "desc" and "limit 10" to filter the top 10.
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB