Hive >> mail # dev >> non map-reduce for simple queries

Namit Jain 2012-07-28, 20:35
Edward Capriolo 2012-07-28, 21:41
Navis류승우 2012-07-29, 01:17
Namit Jain 2012-07-29, 14:28
Namit Jain 2012-07-29, 14:45
Owen OMalley 2012-07-30, 21:01
Navis류승우 2012-07-31, 01:37
Namit Jain 2012-07-31, 04:12
Re: non map-reduce for simple queries
On Mon, Jul 30, 2012 at 9:12 PM, Namit Jain <[EMAIL PROTECTED]> wrote:

> The total number of bytes of the input will be used to determine whether
> to not launch a map-reduce job for this
> query. That was in my original mail.
> However, given any complex where condition and the lack of column
> statistics in hive, we cannot determine the
> number of bytes that would be needed to satisfy the where condition.
All of these heuristics are guidelines, clearly. My inclination would
be to use the maximum data volume as the primary metric until we have a
better understanding of cases where that doesn't work well. If we are going
to try the local solution and fall back to mapreduce, it seems better to
put a limit well short of being done so that you don't waste as much work.
Perhaps, if the query isn't 10% done in the first 5 seconds of running
locally, you switch to mapreduce. Would that work?
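
To make the proposal concrete, here is a rough sketch of the two checks discussed in this thread: a cap on input bytes to decide whether to start locally, and a progress checkpoint (10% within 5 seconds) to decide whether to abandon the local run. This is not Hive code. The names `LocalPolicy`, `start_locally`, `should_fall_back`, and `run_query` are hypothetical, the 256 MB cap is an assumed placeholder, and progress is fed in as pre-recorded samples rather than measured from a real query.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class LocalPolicy:
    # Assumed placeholder: the thread does not name a byte limit.
    max_input_bytes: int = 256 * 1024 * 1024
    # From the thread: at least 10% done ...
    min_progress: float = 0.10
    # ... within the first 5 seconds of running locally.
    check_after_s: float = 5.0


def start_locally(input_bytes: int, policy: LocalPolicy) -> bool:
    """Primary metric: total input size, ignoring the where condition."""
    return input_bytes <= policy.max_input_bytes


def should_fall_back(progress: float, elapsed_s: float, policy: LocalPolicy) -> bool:
    """True once the checkpoint has passed and progress is still too low."""
    return elapsed_s >= policy.check_after_s and progress < policy.min_progress


def run_query(
    input_bytes: int,
    progress_samples: Iterable[Tuple[float, float]],
    policy: LocalPolicy = LocalPolicy(),
) -> str:
    """Return which engine finishes the query: "local" or "mapreduce".

    progress_samples is a sequence of (elapsed_seconds, fraction_done)
    pairs standing in for progress reports from a local run.
    """
    if not start_locally(input_bytes, policy):
        return "mapreduce"
    for elapsed_s, progress in progress_samples:
        if should_fall_back(progress, elapsed_s, policy):
            # Give up early so little local work is wasted.
            return "mapreduce"
        if progress >= 1.0:
            return "local"
    return "local"
```

With these assumptions, a small query that reports only 5% at the 5 second mark switches to mapreduce, while one that has reached 40% by then stays local. Putting the checkpoint early is the point of the proposal: the work thrown away on a fallback is bounded by roughly 5 seconds of local execution.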

-- Owen
Namit Jain 2012-07-31, 06:38
Owen OMalley 2012-07-31, 15:53
Namit Jain 2012-07-31, 17:47