Query crawls through reducer
The following query translates into a many-map, single-reduce job (which is common), but it also slogs through the reduce stage, and that is killing the overall query:

select * from a where b >= 'c' order by b desc limit 100

Note that b is a partition column. Which part of the query makes the reducer so heavy: the order by or the limit? (I assume it isn't the where clause, since that only filters on the partition, right?) Are there ways to improve its performance?
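To make the question concrete, here is a rough mental model of the job shape described above, written as a small Python sketch. It is not Hive's actual implementation: the function names and the toy data are hypothetical, and it only assumes that a global order by has to pass through one reducer. The first function sends every filtered row to that reducer; the second shows the same result when each mapper trims its output to the limit first, which is the kind of saving one would hope for.

```python
import heapq

def single_reducer_order_by(map_outputs, limit):
    # Assumed model: all filtered rows from all mappers reach one reducer,
    # which sorts the whole set in descending order and keeps the first `limit`.
    all_rows = [row for rows in map_outputs for row in rows]
    return sorted(all_rows, reverse=True)[:limit]

def top_n_per_mapper(map_outputs, limit):
    # Hypothetical alternative: each mapper keeps only its own `limit`
    # largest rows, so the reducer merges at most mappers * limit rows.
    trimmed = [heapq.nlargest(limit, rows) for rows in map_outputs]
    return heapq.nlargest(limit, (row for rows in trimmed for row in rows))

# Toy stand-in for the values of b that survive the where clause on each mapper.
map_outputs = [["c", "f", "d"], ["e", "z", "c"], ["m", "k"]]

# Both strategies return the same rows; only the reducer's workload differs.
assert single_reducer_order_by(map_outputs, 3) == top_n_per_mapper(map_outputs, 3)
```

If the reducer really does receive every matching row, its cost grows with the size of the filtered table rather than with the limit of 100, which would point at the order by rather than the limit.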

Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com

"You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
scratch. All together this implies: He scratched the itch from the scratch that
itched but would never itch the scratch from the itch that scratched."
                                           --  Keith Wiley