Hive, mail # user - How to make the query compiler not determine the number of reducers?


Re: How to make the query compiler not determine the number of reducers?
Bejoy KS 2012-04-30, 14:14
Thomas,

     That needn't be the case; raising the number of map tasks may not have any effect on the number of reduce tasks. Maybe we can help you out if you could provide some details, such as:
- the query you are executing
- the output of DESCRIBE FORMATTED on the tables involved in the query
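
For reference, a minimal sketch of how the details requested above could be collected from the Hive CLI. The table names and the query are taken from the original message further down this thread; the EXPLAIN statement is an addition that was not asked for here, included on the assumption that the query plan is also useful when looking at reducer counts.

```sql
-- Table metadata for both tables named in the original query.
DESCRIBE FORMATTED employees;
DESCRIBE FORMATTED stores;

-- Assumption: not requested in this message. EXPLAIN prints the
-- query plan without launching the MapReduce job.
EXPLAIN
SELECT test_udf(name, store) FROM employees JOIN stores;
```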

  
Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: "Ryabin, Thomas" <[EMAIL PROTECTED]>
Date: Mon, 30 Apr 2012 10:06:01
To: <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Subject: RE: How to make the query compiler not determine the number of reducers?

I tried using this to set the number of reduce tasks to 2, but it
doesn't work for me. In my case the Hive query always creates 8 map
tasks and 1 reduce task. Could the number of reduce tasks be limited by
the number of map tasks, so that if I wanted 2 reduce tasks I would need
to increase the number of map tasks to 16 in my case?

 

-Thomas

 

From: Bejoy KS [mailto:[EMAIL PROTECTED]]
Sent: Saturday, April 28, 2012 1:43 AM
To: [EMAIL PROTECTED]
Subject: Re: How to make the query compiler not determine the number of
reducers?

 

Hi Thomas
Hive automatically sets the number of reducers for you, but you can
easily override it from the CLI. Before executing your query, run:
hive> SET mapred.reduce.tasks=n;

where n is the required number of reducers.
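
As a sketch of that sequence in one CLI session: the value 2 is only an illustrative choice, and the query is the one from the original question below. The two hive.exec.reducers.* properties are not mentioned in this message; they are the settings Hive uses when it estimates the reducer count itself, and the values shown are assumptions picked for the example.

```sql
-- Print the current value; -1 normally means Hive estimates the count.
SET mapred.reduce.tasks;

-- Request a fixed number of reducers for later queries in this session.
SET mapred.reduce.tasks=2;

-- Alternative (assumed values): leave the count to Hive but steer the estimate.
-- SET hive.exec.reducers.bytes.per.reducer=256000000;
-- SET hive.exec.reducers.max=16;

-- The query from the original question.
SELECT test_udf(name, store) FROM employees JOIN stores;
```

If Hive still reports "Number of reduce tasks determined at compile time: 1", the query plan itself is probably fixing the count, and in that case this setting would not be expected to change it.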

Regards
Bejoy KS

Sent from handheld, please excuse typos.

  _____  

From: "Ryabin, Thomas" <[EMAIL PROTECTED]>

Date: Fri, 27 Apr 2012 16:48:25 -0400

To: <[EMAIL PROTECTED]>

ReplyTo: [EMAIL PROTECTED]

Subject: How to make the query compiler not determine the number of
reducers?

 

Hi,

 

When I run a query that uses a custom UDF I made, one of the lines it
prints out is:

Number of reduce tasks determined at compile time: 1

 

And this causes the MapReduce job to have only 1 reducer. Is there a way
to make it so the compiler does not determine the number of reduce tasks
to create, so I can specify the number myself?

 

The query in question is:

select test_udf(name, store) from employees join stores;

 

Thanks,

Thomas