Re: "attempt*" directories in user logs
Hemanth Yamijala 2012-12-11, 04:03
However, in the case Oleg is talking about the attempts are:
These aren't multiple attempts of a single task, are they? They are
actually different tasks. If they were multiple attempts, I would expect
the last digit to get incremented, like attempt_201212051224_0021_m_000000_0
and attempt_201212051224_0021_m_000000_1, for instance.
It looks like at least 3 different tasks were launched on this node. One of
them could be a setup task. Oleg, how many map tasks does the JobTracker UI
show for this job?
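To make the naming convention concrete, here is a minimal Python sketch that splits an attempt directory name into its task and attempt number. The field layout in the regular expression (JobTracker start time, job number, m/r, task number, attempt number) is my reading of the names in this thread, and parse_attempt and attempts_per_task are hypothetical helper names, not part of Hadoop.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the directory names in this thread:
# attempt_<jobtracker start time>_<job number>_<m|r>_<task number>_<attempt number>
ATTEMPT_RE = re.compile(r"^attempt_(\d+)_(\d+)_([mr])_(\d+)_(\d+)$")


def parse_attempt(name):
    """Split an attempt directory name into job, task, type and attempt number."""
    match = ATTEMPT_RE.match(name)
    if match is None:
        raise ValueError(f"not an attempt directory name: {name!r}")
    jt_start, job_no, kind, task_no, attempt_no = match.groups()
    return {
        "job": f"job_{jt_start}_{job_no}",
        "task": f"task_{jt_start}_{job_no}_{kind}_{task_no}",
        "type": "map" if kind == "m" else "reduce",
        "attempt": int(attempt_no),
    }


def attempts_per_task(names):
    """Group attempt numbers by the task they belong to."""
    grouped = defaultdict(list)
    for name in names:
        info = parse_attempt(name)
        grouped[info["task"]].append(info["attempt"])
    return dict(grouped)


# The four directory names from Oleg's 'ls' output.
names = [
    "attempt_201212051224_0021_m_000000_0",
    "attempt_201212051224_0021_m_000002_0",
    "attempt_201212051224_0021_m_000003_0",
    "attempt_201212051224_0021_r_000000_0",
]

for task, attempts in sorted(attempts_per_task(names).items()):
    print(task, attempts)
```

Run against the listing, each of the four directories maps to a different task, and every one of them is attempt number 0. A retried or speculative attempt would instead show up as a second entry under the same task.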
On Tue, Dec 11, 2012 at 12:19 AM, Vinod Kumar Vavilapalli <
[EMAIL PROTECTED]> wrote:
> MR launches multiple attempts for a single Task in case of TaskAttempt
> failures or when speculative execution is turned on. In either case, a
> given Task will only ever have one successful TaskAttempt whose output will
> be accepted (committed).
> Number of reduces is set to 1 by default in mapred-default.xml - you
> should explicitly set it to zero if you don't want reducers.
> By master, I suppose you mean JobTracker. JobTracker doesn't show all the
> attempts for a given Task; you should navigate to the per-task page to see them.
> +Vinod Kumar Vavilapalli
> Hortonworks Inc.
> On Dec 9, 2012, at 6:53 AM, Oleg Zhurakousky wrote:
> I am studying user logs on the two-node cluster that I have set up, and I was
> wondering if anyone can shed some light on these "attempt*" directories:
> $ ls
> attempt_201212051224_0021_m_000000_0 attempt_201212051224_0021_m_000003_0
> attempt_201212051224_0021_m_000002_0 attempt_201212051224_0021_r_000000_0
> I mean it's obvious that it's talking about 3 attempts for the Map task and 1
> attempt for the reduce task. However, my current MR job only results in some
> output written to "attempt_201212051224_0021_m_000000_0". There is nothing in the
> reduce part (understandably, since I don't even have a reducer), so my
> questions are:
> 1. The two additional M attempts... what are they?
> 2. Why was there an attempt to do a Reduce when no reducer was
> 3. Why did my master node only have 1 attempt for the M task while the slave had
> everything displayed and questioned above (the 'ls' output above is from the
> slave node)?