Hive user mailing list: Hive Query Unable to distribute load evenly in reducers


Saurabh Mishra 2012-10-15, 12:09
Re: Hive Query Unable to distribute load evenly in reducers
And your queries were?

On Mon, Oct 15, 2012 at 8:09 PM, Saurabh Mishra
<[EMAIL PROTECTED]> wrote:
> Hi,
> I am running some Hive queries joining tables containing up to 30 million
> records each. Since the load on the reducers is very significant in these
> cases, I specifically set the following parameters before executing the
> queries:
>
> set mapred.reduce.tasks=100;
> set hive.exec.reducers.bytes.per.reducer=500000000;
> set hive.optimize.cp=true;
>
> The number of reducers the job spawns is now 160, but despite the high number
> most of the load remains on 1 or 2 reducers. Hence, in the final
> statistics, 158 reducers completed within 2-3 minutes of starting and 2
> reducers took 2 hours to run.
> Is there any way to overcome this load distribution disparity?
> Any help in this regard will be highly appreciated.
>
> Sincerely
> Saurabh Mishra
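
A common first check for the symptom described above is whether a few join key values account for most of the rows, since all rows sharing a key are sent to the same reducer no matter how many reducers are configured. The sketch below is an illustration under assumptions and is not taken from the thread: `big_table` and `join_key` are hypothetical placeholders for one of the joined tables and its join column, and whether the skew-join settings help depends on the Hive version and the actual query.

```sql
-- Hypothetical names: replace big_table / join_key with a real table and its join column.
-- Step 1: list the most frequent join key values to see whether a few dominate.
SELECT join_key, COUNT(*) AS cnt
FROM big_table
GROUP BY join_key
ORDER BY cnt DESC
LIMIT 20;

-- Step 2 (assumption: skew is confirmed): ask Hive to process heavy keys
-- separately in a follow-up map join instead of on a single reducer.
set hive.optimize.skewjoin=true;
-- Keys with more rows than this threshold are treated as skewed
-- (100000 is the Hive default).
set hive.skewjoin.key=100000;
```

If the top keys turn out to be NULL or a default placeholder value, filtering those rows out before the join may be simpler than tuning the join itself.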
Saurabh Mishra 2012-10-15, 14:23
Philip Tromans 2012-10-15, 15:29
Saurabh Mishra 2012-10-15, 20:45
Navis류승우 2012-10-16, 05:17
Saurabh Mishra 2012-10-16, 05:53
Saurabh Mishra 2012-10-18, 08:56
Philip Tromans 2012-10-18, 09:03