Re: About MapReduce's Setup
Hemanth Yamijala 2010-10-23, 03:09
Apologies for a very delayed response.
The setup task is under the control of the user, who can provide an
implementation that makes sense for his/her M/R job. That said, as with
other APIs in M/R, the library ships a default implementation for the
common use cases. For example, the default setup task creates the
'temporary' output directory on HDFS into which tasks write their
output.
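To make the idea concrete, here is a minimal sketch of that setup step. It is a model on the local filesystem using only java.nio, not the Hadoop API: the class name SetupSketch, its method names, and the commit step that promotes files out of the scratch directory are all assumptions made for illustration. In Hadoop itself the corresponding hook is, as far as I know, the setupJob method of the job's OutputCommitter, and the scratch directory name "_temporary" below mirrors what I believe the default file-based committer uses.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Local-filesystem model of job setup. Hypothetical; NOT the Hadoop API. */
final class SetupSketch {
    // Scratch directory name; an assumption chosen for this sketch.
    static final String TEMP_DIR = "_temporary";

    /** Job setup: create the scratch directory under the job's output directory. */
    public static Path setupJob(Path outputDir) throws IOException {
        return Files.createDirectories(outputDir.resolve(TEMP_DIR));
    }

    /** A task writes into the scratch directory, not into the final location. */
    public static Path writeTaskOutput(Path outputDir, String taskId, String data)
            throws IOException {
        Path file = outputDir.resolve(TEMP_DIR).resolve(taskId);
        return Files.write(file, data.getBytes(StandardCharsets.UTF_8));
    }

    /** Illustrative commit step (an assumption): promote files, remove scratch dir. */
    public static void commitJob(Path outputDir) throws IOException {
        Path tmp = outputDir.resolve(TEMP_DIR);
        List<Path> pending;
        try (Stream<Path> files = Files.list(tmp)) {
            // Snapshot the listing first so we do not move files while iterating.
            pending = files.collect(Collectors.toList());
        }
        for (Path f : pending) {
            Files.move(f, outputDir.resolve(f.getFileName()));
        }
        Files.delete(tmp);
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempDirectory("job-output");
        setupJob(out);
        writeTaskOutput(out, "part-00000", "hello\n");
        commitJob(out);
        System.out.println("final output present: "
                + Files.exists(out.resolve("part-00000")));
    }
}
```

The point of writing into a scratch directory first is that partial output from failed or still-running tasks never appears in the final output location. A user-supplied setup implementation could do something entirely different at the same hook.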
The M/R documentation has more information on this, so please refer to it.
Hope that helps.
On Thu, Oct 14, 2010 at 10:11 AM, He Chen <[EMAIL PROTECTED]> wrote:
> Hi Hemanth
> Thank you for your kind reply. Do you know what the setup task actually does? Does
> it take data locality into account?
> On Wed, Oct 13, 2010 at 11:38 PM, Hemanth Yamijala <[EMAIL PROTECTED]> wrote:
>> If you are talking about the 'Setup task' that is used to initialize
>> or set up the job, yes, it can run in either a map slot or a reduce
>> slot, depending on what is available.
>> On Thu, Oct 14, 2010 at 1:54 AM, He Chen <[EMAIL PROTECTED]> wrote:
>> > Hi, all
>> > I found that if there is no free map slot, Hadoop will use a reduce slot
>> > to set up a MapReduce job when I submit a series of jobs.
>> > The first two jobs set themselves up with a MapAttempt. However, they
>> > occupy all the map slots. When the third job arrives, I find that it uses
>> > a ReduceAttempt to do the setup. After that, no more setup is logged in the
>> > history.
>> > Am I correct?
>> > Chen