Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> Re: Need rank()


+
Keith Wiley 2013-04-02, 17:03
+
Nitin Pawar 2013-04-02, 17:06
+
Keith Wiley 2013-04-02, 17:15
+
Nitin Pawar 2013-04-02, 17:22
+
Keith Wiley 2013-04-02, 19:44
+
Keith Wiley 2013-04-02, 19:48
Copy link to this message
-
Re: Need rank()
You are getting the error because you are ORDERing BY rank but rank is not
in the top SELECT

Also, DISTRIBUTE BY/SORT BY are done after SELECT so you have to use a
subquery:
SELECT ..., rank(user)
FROM (SELECT ... DISTRIBUTE BY ... SORT BY)

igor
decide.com
On Tue, Apr 2, 2013 at 10:03 AM, Keith Wiley <[EMAIL PROTECTED]> wrote:

> On Apr 1, 2013, at 16:12 , Alexander Pivovarov wrote:
>
> >
> http://ragrawal.wordpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/
>
> Is there any possibility there is a bug in Ritesh Agrawal's query
> statement from that article?  I created a test table with the exact column
> names from the example in the article and used a minimally altered version
> of the command (I removed the where clause to simplify things a bit) and
> got an error which suggests there is something slightly wrong with the
> command (or perhaps the table has to be configured a special way).  Here's
> what I get when I almost perfectly duplicate that example:
>
> hive> describe test;
> OK
> user    string
> category        string
> value   int
> Time taken: 0.082 seconds
> =================================================> hive> select * from test;
> OK
> user1   cat1    1
> user1   cat1    2
> user1   cat1    3
> user1   cat2    10
> user1   cat2    20
> user1   cat2    30
> user2   cat1    11
> user2   cat1    21
> user2   cat1    31
> user2   cat2    5
> user2   cat2    6
> user2   cat2    7
> Time taken: 0.202 seconds
> =================================================> hive> SELECT user, category, value
>     > FROM (
>     > SELECT user, category, rank(user) as rank, value
>     > FROM test
>     > DISTRIBUTE BY user
>     > SORT BY user, value desc
>     > ) a
>     > WHERE rank > 1
>     > ORDER BY user, rank;
> FAILED: Error in semantic analysis: Line 9:15 Invalid table alias or
> column reference 'rank': (possible column names are: _col0, _col1, _col2)
> hive>
>
>
> ________________________________________________________________________________
> Keith Wiley     [EMAIL PROTECTED]     keithwiley.com
> music.keithwiley.com
>
> "What I primarily learned in grad school is how much I *don't* know.
> Consequently, I left grad school with a higher ignorance to knowledge
> ratio than
> when I entered."
>                                            --  Keith Wiley
>
> ________________________________________________________________________________
>
>
+
Keith Wiley 2013-04-02, 19:51
+
Edward Capriolo 2013-04-02, 20:14
+
Edward Capriolo 2013-04-02, 19:59
+
Keith Wiley 2013-04-02, 20:44
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB