Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> Re: Need rank()


+
Keith Wiley 2013-04-02, 17:03
+
Nitin Pawar 2013-04-02, 17:06
+
Keith Wiley 2013-04-02, 17:15
+
Nitin Pawar 2013-04-02, 17:22
+
Keith Wiley 2013-04-02, 19:44
Copy link to this message
-
Re: Need rank()
I'm pretty sure there are at least a few bugs in that article.  The problem described below goes away if "rank" is added to the outer select statement.  If there's one error I suppose there may be others.  Although I have gotten the query to run, I'm not getting properly ranked results yet.  I'm still working on it.

On Apr 2, 2013, at 10:03 , Keith Wiley wrote:

> On Apr 1, 2013, at 16:12 , Alexander Pivovarov wrote:
>
>> http://ragrawal.wordpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/
>
> Is there any possibility there is a bug in Ritesh Agrawal's query statement from that article?  I created a test table with the exact column names from the example in the article and used a minimally altered version of the command (I removed the where clause to simplify things a bit) and got an error which suggests there is something slightly wrong with the command (or perhaps the table has to be configured a special way).  Here's what I get when I almost perfectly duplicate that example:
>
> hive> describe test;
> OK
> user string
> category string
> value int
> Time taken: 0.082 seconds
> =================================================> hive> select * from test;
> OK
> user1 cat1 1
> user1 cat1 2
> user1 cat1 3
> user1 cat2 10
> user1 cat2 20
> user1 cat2 30
> user2 cat1 11
> user2 cat1 21
> user2 cat1 31
> user2 cat2 5
> user2 cat2 6
> user2 cat2 7
> Time taken: 0.202 seconds
> =================================================> hive> SELECT user, category, value
>> FROM (
>> SELECT user, category, rank(user) as rank, value
>> FROM test
>> DISTRIBUTE BY user
>> SORT BY user, value desc
>> ) a
>> WHERE rank > 1
>> ORDER BY user, rank;
> FAILED: Error in semantic analysis: Line 9:15 Invalid table alias or column reference 'rank': (possible column names are: _col0, _col1, _col2)
> hive>
>
> ________________________________________________________________________________
> Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com
>
> "What I primarily learned in grad school is how much I *don't* know.
> Consequently, I left grad school with a higher ignorance to knowledge ratio than
> when I entered."
>                                           --  Keith Wiley
> ________________________________________________________________________________
>
________________________________________________________________________________
Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com

"Yet mark his perfect self-contentment, and hence learn his lesson, that to be
self-contented is to be vile and ignorant, and that to aspire is better than to
be blindly and impotently happy."
                                           --  Edwin A. Abbott, Flatland
________________________________________________________________________________
+
Igor Tatarinov 2013-04-02, 17:56
+
Keith Wiley 2013-04-02, 19:51
+
Edward Capriolo 2013-04-02, 20:14
+
Edward Capriolo 2013-04-02, 19:59
+
Keith Wiley 2013-04-02, 20:44
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB