Hive, mail # user - Use distribute to spread across reducers


Use distribute to spread across reducers
Keith Wiley 2013-10-02, 18:48
I'm trying to create a subset of a large table for testing.  The following approach works:

create table subset_table as
select * from large_table limit 1000

...but it only uses one reducer.  I would like to speed up the process of creating a subset by distributing the work across multiple reducers.  I already tried explicitly setting mapred.reduce.tasks and hive.exec.reducers.max to values larger than 1, but in this particular case Hive's internal query-to-MapReduce conversion seems to override those values; it ignores those parameters.
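For reference, a minimal sketch of what setting those two parameters looks like in a Hive session before running the query. The value 8 is purely illustrative (an assumption, not the value actually tried); as described above, the plan for this particular query still ends up with a single reducer.

```sql
-- Illustrative values only; the original message does not say which values were used.
set mapred.reduce.tasks=8;
set hive.exec.reducers.max=8;

create table subset_table as
select * from large_table limit 1000;
```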

So, I tried this:

create table subset_table as
select * from large_table limit 1000
distribute by column_name

...but that doesn't parse.  I get the following error:

OK FAILED: ParseException line 3:0 missing EOF at 'distribute' near '1000'.
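One likely reason for the parse error is clause order: in Hive's SELECT grammar, DISTRIBUTE BY (like CLUSTER BY and SORT BY) comes before LIMIT, so a DISTRIBUTE BY written after LIMIT is read as unexpected trailing input. The sketch below only reorders the clauses of the failing query. It should get past the parser, but that is all it is meant to show: whether it spreads the work across several reducers is not established here, and Hive may still funnel the final LIMIT through a single reducer.

```sql
-- Same tables and column as the failing query; only the clause order differs.
-- Assumption: parsing succeeds; multi-reducer execution is NOT guaranteed.
create table subset_table as
select * from large_table
distribute by column_name
limit 1000;
```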

I have tried NUMEROUS applications of parentheses, nested queries, etc.  For example, here's just one (amongst perhaps ten variations on a theme):

create table subset_table as
select * from (
from (
select * from large_table limit 1000
distribute by column_name
)) s

Like I said, I've tried all sorts of combinations of the elements shown above.  So far I have not even gotten any syntax to parse, much less run.  Only the original query at the top will even pass the parsing stage of processing.

Any ideas?

Thanks.

________________________________________________________________________________
Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com

"I do not feel obliged to believe that the same God who has endowed us with
sense, reason, and intellect has intended us to forgo their use."
                                           --  Galileo Galilei
________________________________________________________________________________