Hive >> mail # user >> Re: Need rank()


git clone https://github.com/edwardcapriolo/hive_test.git
cd hive_test/
mvn -Dmaven.test.skip=true install
cd ..
[edward@jackintosh java]$ git clone https://github.com/edwardcapriolo/hive-rank.git
Cloning into 'hive-rank'...
remote: Counting objects: 74, done.
remote: Compressing objects: 100% (35/35), done.
remote: Total 74 (delta 12), reused 70 (delta 8)
Unpacking objects: 100% (74/74), done.
[edward@jackintosh java]$ cd hive-rank/
[edward@jackintosh hive-rank]$ mvn install -Dmaven.test.skip=true
/usr/java/jdk1.7.0_13
...
[INFO] Installing /home/edward/Documents/java/hive-rank/target/hive-rank-1.0.0-SNAPSHOT.jar to /home/edward/.m2/repository/com/m6d/hive-rank/1.0.0-SNAPSHOT/hive-rank-1.0.0-SNAPSHOT.jar
[INFO] Installing /home/edward/Documents/java/hive-rank/pom.xml to /home/edward/.m2/repository/com/m6d/hive-rank/1.0.0-SNAPSHOT/hive-rank-1.0.0-SNAPSHOT.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.986s
[INFO] Finished at: Tue Apr 02 16:11:41 EDT 2013
[INFO] Final Memory: 17M/210M
[INFO] ------------------------------------------------------------------------

Then copy the built jar,
/home/edward/Documents/java/hive-rank/target/hive-rank-1.0.0-SNAPSHOT.jar
(Maven also installs it as
/home/edward/.m2/repository/com/m6d/hive-rank/1.0.0-SNAPSHOT/hive-rank-1.0.0-SNAPSHOT.jar),
to your hadoop lib.

add jar <name of jar file>
..... etc etc
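
Igor's point in the quoted message below, that the DISTRIBUTE BY/SORT BY has to happen in a subquery before rank() is applied, is easier to see with a small simulation. This is only a sketch in plain Java with no Hive classes. RankSketch, Rank, Row and rankByUser are names made up for illustration and are not taken from hive-rank or from Ritesh's article. It assumes the rank function is a simple stateful counter that restarts when its argument changes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-alone sketch; it does not use or extend any Hive class.
final class RankSketch {

    // Assumed behaviour of a simple rank function: a counter that restarts
    // at 1 whenever the key differs from the key seen on the previous call.
    static final class Rank {
        private String lastKey;
        private int counter;

        int evaluate(String key) {
            if (!Objects.equals(key, lastKey)) {
                counter = 0;
                lastKey = key;
            }
            return ++counter;
        }
    }

    // One row of the test table from the thread: user, category, value.
    static final class Row {
        final String user;
        final String category;
        final int value;

        Row(String user, String category, int value) {
            this.user = user;
            this.category = category;
            this.value = value;
        }
    }

    // Stands in for "inner query sorts, outer query ranks": rows are ordered
    // by user and then by value descending before the counter sees them.
    static List<String> rankByUser(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((Row r) -> r.user)
                .thenComparing(Comparator.comparingInt((Row r) -> r.value).reversed()));

        Rank rank = new Rank();
        List<String> out = new ArrayList<>();
        for (Row r : sorted) {
            out.add(r.user + "\t" + r.value + "\t" + rank.evaluate(r.user));
        }
        return out;
    }
}
```

If the rows reach the counter unsorted (user1, user2, user1, ...), every call sees a changed key and returns 1, which is why the sort cannot be left to the same SELECT that calls rank().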
On Tue, Apr 2, 2013 at 3:51 PM, Keith Wiley <[EMAIL PROTECTED]> wrote:

> Yep, the original article is definitely erroneous in this regard.  I
> figured that out eventually.  I'm not sure how much I can trust that
> resource now.  I may have to look elsewhere.  I agree that Edward's
> description is pretty good, but as I said earlier, I can't actually use his
> code, so I'm trying to cobble a workable solution together from the various
> resources available.  Ritesh's article, despite the error in the Hive
> syntax, is still useful in that it enables one to quickly compile a simple
> rank jar without relying on git, maven, or other project dependencies --
> problems which have plagued me with Edward's approach.  So, if I can use
> Ritesh's method to write a simple rank function, and Edward's accurate
> description of how to construct the query, then I can put all the pieces
> together into a workable solution.
>
> I'll let you know if I get it.
>
> On Apr 2, 2013, at 10:56 , Igor Tatarinov wrote:
>
> > You are getting the error because you are ORDERing BY rank but rank is
> > not in the top SELECT
> >
> > Also, DISTRIBUTE BY/SORT BY are done after SELECT so you have to use a
> > subquery:
> > SELECT ..., rank(user)
> > FROM (SELECT ... DISTRIBUTE BY ... SORT BY)
> >
> > igor
> > decide.com
> >
> >
> > On Tue, Apr 2, 2013 at 10:03 AM, Keith Wiley <[EMAIL PROTECTED]> wrote:
> > On Apr 1, 2013, at 16:12 , Alexander Pivovarov wrote:
> >
> > > http://ragrawal.wordpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/
> >
> > Is there any possibility there is a bug in Ritesh Agrawal's query
> > statement from that article?  I created a test table with the exact column
> > names from the example in the article and used a minimally altered version
> > of the command (I removed the where clause to simplify things a bit) and
> > got an error which suggests there is something slightly wrong with the
> > command (or perhaps the table has to be configured a special way).  Here's
> > what I get when I almost perfectly duplicate that example:
> >
> > hive> describe test;
> > OK
> > user    string
> > category        string
> > value   int
> > Time taken: 0.082 seconds
> > =================================================
> > hive> select * from test;
> > OK
> > user1   cat1    1
> > user1   cat1    2
> > user1   cat1    3
> > user1   cat2    10
> > user1   cat2    20
> > user1   cat2    30
> > user2   cat1    11
> > user2   cat1    21
> > user2   cat1    31
> > user2   cat2    5