Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> order by having no effect?!


Copy link to this message
-
Re: order by having no effect?!
Hi Keith,
Hive has 2 sort of (well, not really) similar clauses: sort by and order by. For order by to work, you need to guarantee 1 reducer.

Read up more here:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy

Mark
Mark Grover, Business Intelligence Analyst
OANDA Corporation

www: oanda.com www: fxtrade.com

"Best Trading Platform" - World Finance's Forex Awards 2009.
"The One to Watch" - Treasury Today's Adam Smith Awards 2009.
----- Original Message -----
From: "Keith Wiley" <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Sent: Tuesday, March 13, 2012 4:54:05 PM
Subject: order by having no effect?!

Um, this is weird.  It simply isn't modifying the order of the returned rows at all.  I get the same result with no 'order by' clause as with one.  Adding a limit or specifying 'asc' has no effect.  Using 'sort by' also has no effect.  The column used for ordering is type INT.  In the example below, I was hoping to "sort" or "order" the results according to the third column.  Like I said, I also tried adding 'limit 10' and/or 'asc' to the end of the query; they had no effect.  The jobtracker shows a single mapper and a single reducer being used for the job incidentally, so it can't be some sort of multi-reducer sort discrepancy.

hive> describe stringmap;
OK
objecttypecode int
attributename string
attributevalue int
langid int
value string
displayorder int
Time taken: 0.074 seconds
hive>
hive> select * from stringmap where attributename='foo' order by 'attributevalue';    
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201202221500_0114, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201202221500_0114
Kill Command = /media/sdb1/kwiley/hadoop/hadoop-0.20.2-cdh3u3/bin/hadoop job  -Dmapred.job.tracker=localhost:9001 -kill job_201202221500_0114
2012-03-13 14:45:28,418 Stage-1 map = 0%,  reduce = 0%
2012-03-13 14:45:31,428 Stage-1 map = 100%,  reduce = 0%
2012-03-13 14:45:39,459 Stage-1 map = 100%,  reduce = 33%
2012-03-13 14:45:40,463 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201202221500_0114
OK
3 foo 6 1033 6 - aaa 6
3 foo 3 1033 3 - bbb 3
3 foo 4 1033 4 - ccc 4
4 foo 1 1033 Default Value 1
3 foo 2 1033 2 - ddd 2
3 foo 1 1033 1 - eee 1
3 foo 5 1033 5 - fff 5
Time taken: 17.954 seconds
hive>

________________________________________________________________________________
Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com

"Luminous beings are we, not this crude matter."
                                           --  Yoda
________________________________________________________________________________
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB