MapReduce, mail # user - Hadoop counter


Re: Hadoop counter
Lin Ma 2012-10-19, 16:09
Hi Harsh,

Thanks for the great reply. Two basic questions,

- Where are the counters' values stored for a successful job? On the JT?
- Supposing a specific job A completed successfully and updated related
counters, is it possible for another job B to read the counters
updated by job A? If yes, how?
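
For illustration, reading a completed job's counters from the client side could
look like the sketch below (new-API `Job` handle; the group, counter, and
property names here are made up for the example, and job B's configuration is
one possible hand-off, not the only one):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

// Sketch: after job A succeeds, its Counters remain readable from the
// client-side Job handle; a value can then be passed to job B through
// job B's Configuration. Names are illustrative, not from the thread.
public class ChainedCounterRead {
    public static void main(String[] args) throws Exception {
        Job jobA = Job.getInstance(new Configuration(), "job A");
        // ... configure mapper/reducer/input/output for job A ...
        if (jobA.waitForCompletion(true)) {
            long n = jobA.getCounters()
                         .findCounter("MyApp", "RECORDS_SEEN").getValue();
            Configuration confB = new Configuration();
            // job B's tasks read this via context.getConfiguration()
            confB.setLong("myapp.jobA.records", n);
            Job jobB = Job.getInstance(confB, "job B");
            // ... configure and submit job B ...
        }
    }
}
```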

regards,
Lin

On Fri, Oct 19, 2012 at 11:50 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Bejoy is almost right, except that counters are reported upon progress
> of the tasks themselves (via TT heartbeats to the JT, actually), but the
> final counter representation is computed only from the successful task
> reports the job received, not from any failed or killed ones.
>
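The aggregation rule described above can be modeled in plain Java (a toy
sketch, not Hadoop code): only counters reported by successful task attempts
contribute to the final job-level totals.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of final-counter computation: sum per-task counter reports,
// skipping any attempt that failed or was killed.
public class CounterAggregation {
    static Map<String, Long> finalCounters(List<Map<String, Long>> reports,
                                           List<Boolean> succeeded) {
        Map<String, Long> totals = new HashMap<>();
        for (int i = 0; i < reports.size(); i++) {
            if (!succeeded.get(i)) continue; // failed/killed: contributes nothing
            for (Map.Entry<String, Long> e : reports.get(i).entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map<String, Long>> reports = List.of(
            Map.of("RECORDS", 10L),  // successful attempt
            Map.of("RECORDS", 7L),   // failed attempt: ignored
            Map.of("RECORDS", 5L));  // successful attempt
        List<Boolean> ok = List.of(true, false, true);
        System.out.println(finalCounters(reports, ok).get("RECORDS")); // prints 15
    }
}
```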
> On Fri, Oct 19, 2012 at 8:51 PM, Bejoy KS <[EMAIL PROTECTED]> wrote:
> > Hi Jay
> >
> > Counters are reported to the JT at the end of a task. So if a task fails,
> > the counters from that task are not sent to the JT and hence won't be
> > included in the final value of counters for that job.
> > Regards
> > Bejoy KS
> >
> > Sent from handheld, please excuse typos.
> > ________________________________
> > From: Jay Vyas <[EMAIL PROTECTED]>
> > Date: Fri, 19 Oct 2012 10:18:42 -0500
> > To: <[EMAIL PROTECTED]>
> > ReplyTo: [EMAIL PROTECTED]
> > Subject: Re: Hadoop counter
> >
> > Ah, this answers a lot about why some of my dynamic counters never show
> > up and I have to bite my nails waiting to see what's going on until the
> > end of the job - thanks.
> >
> > Another question: what happens if a task fails? What happens to the
> > counters for it? Do they disappear into the ether? Or do they get merged
> > in with the counters from other tasks?
> >
> > On Fri, Oct 19, 2012 at 9:50 AM, Bertrand Dechoux <[EMAIL PROTECTED]>
> > wrote:
> >>
> >> And by default the number of counters is limited to 120 with the
> >> mapreduce.job.counters.limit property.
> >> They are useful for displaying short statistics about a job but should
> >> not be used for results (imho).
> >> I know people may misuse them but I haven't tried so I wouldn't be able
> >> to list the caveats.
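
The cap Bertrand mentions is a configuration property; a mapred-site.xml
fragment to raise it might look like the following (the 120 default and this
property name match older MR releases; later versions renamed the knob
mapreduce.job.counters.max, so check your version's docs):

```xml
<!-- mapred-site.xml: raise the per-job counter cap from the default of 120 -->
<property>
  <name>mapreduce.job.counters.limit</name>
  <value>200</value>
</property>
```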
> >>
> >> Regards
> >>
> >> Bertrand
> >>
> >>
> >> On Fri, Oct 19, 2012 at 4:35 PM, Michael Segel <[EMAIL PROTECTED]>
> >> wrote:
> >>>
> >>> As I understand it... each Task has its own counters, which are
> >>> independently updated. As the tasks report back to the JT, they update
> >>> the counter(s)' status.
> >>> The JT then will aggregate them.
> >>>
> >>> In terms of performance, counters take up some memory in the JT, so
> >>> while it's OK to use them, if you abuse them you can run into issues.
> >>> As to limits... I guess that will depend on the amount of memory on the
> >>> JT machine, the size of the cluster (number of TTs), and the number of
> >>> counters.
> >>>
> >>> In terms of global accessibility... Maybe.
> >>>
> >>> The reason I say maybe is that I'm not sure what you mean by globally
> >>> accessible.
> >>> If a task creates and implements a dynamic counter... I know that it
> >>> will eventually be reflected in the JT. However, I do not believe that a
> >>> separate task could connect with the JT and see if the counter exists,
> >>> or get a value, or even an accurate value, since the updates are
> >>> asynchronous.
> >>> Not to mention that I don't believe the counters are aggregated until
> >>> the job ends. It would make sense that the JT maintains a unique counter
> >>> for each task until the tasks complete. (If a task fails, it would have
> >>> to delete its counters so that when the task is restarted the correct
> >>> count is maintained.) Note, I haven't looked at the source code, so I am
> >>> probably wrong.
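
For reference, the kind of "dynamic counter" being discussed is one named by
strings at runtime inside a task rather than declared up front as an enum. A
hedged sketch (group and counter names are illustrative):

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch of a Mapper incrementing a dynamic (string-named) counter.
// The TT piggybacks counter updates on its heartbeats to the JT, which
// aggregates them per job.
public class DynCounterMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        context.getCounter("MyApp", "RECORDS_SEEN").increment(1);
        // ... normal map logic ...
    }
}
```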
> >>>
> >>> HTH
> >>> Mike
> >>> On Oct 19, 2012, at 5:50 AM, Lin Ma <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Hi guys,
> >>>
> >>> I have some quick questions regarding Hadoop counters:
> >>>
> >>> Is a Hadoop counter (custom-defined) globally accessible (for both
> >>> read and write) by all Mappers and Reducers in a job?
> >>> What is the performance cost, and what are the best practices for using
> >>> Hadoop counters? I