MapReduce, mail # user - Hadoop counter


Re: Hadoop counter
Bertrand Dechoux 2012-10-19, 14:50
And by default the number of counters per job is limited to 120; the limit is
set by the mapreduce.job.counters.limit property.
They are useful for displaying short statistics about a job but should not
be used for results (imho).
I know people may misuse them but I haven't tried so I wouldn't be able to
list the caveats.
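
To make the mechanics discussed in this thread concrete, here is a minimal
plain-Java sketch of the model Michael describes in the quoted message below:
each task keeps its own counters, reports them, and a job-level view sums the
latest report per task. It is not the Hadoop API; every class and method name
in it is hypothetical, and the behaviour has not been checked against the
Hadoop source. The cap of 120 simply mirrors the default mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of per-task counters aggregated at job level.
// NOT the Hadoop API: all names here are hypothetical.
final class CounterAggregationSketch {

    /** Counters kept locally by one task. */
    static final class TaskCounters {
        private final Map<String, Long> values = new HashMap<>();

        void increment(String name, long delta) {
            values.merge(name, delta, Long::sum);
        }

        Map<String, Long> snapshot() {
            return new HashMap<>(values);
        }
    }

    /** Job-level view: keeps the latest report per task and sums on demand. */
    static final class JobCounters {
        private final int limit;
        private final Map<String, Map<String, Long>> reportsByTask = new HashMap<>();

        JobCounters(int limit) {
            this.limit = limit;
        }

        /** Stores a task's report, replacing any earlier report from that task. */
        void report(String taskId, TaskCounters counters) {
            Map<String, Long> previous = reportsByTask.put(taskId, counters.snapshot());
            if (aggregate().size() > limit) {
                // Roll back so a rejected report leaves the totals unchanged.
                if (previous == null) {
                    reportsByTask.remove(taskId);
                } else {
                    reportsByTask.put(taskId, previous);
                }
                throw new IllegalStateException("Too many counters: limit is " + limit);
            }
        }

        /** Drops a task's counters, e.g. when the task fails and is restarted. */
        void discard(String taskId) {
            reportsByTask.remove(taskId);
        }

        /** Sums each counter name across all tasks that have reported. */
        Map<String, Long> aggregate() {
            Map<String, Long> totals = new HashMap<>();
            for (Map<String, Long> report : reportsByTask.values()) {
                report.forEach((name, value) -> totals.merge(name, value, Long::sum));
            }
            return totals;
        }
    }

    public static void main(String[] args) {
        JobCounters job = new JobCounters(120);

        TaskCounters map0 = new TaskCounters();
        map0.increment("RECORDS_SEEN", 3);
        TaskCounters map1 = new TaskCounters();
        map1.increment("RECORDS_SEEN", 2);
        map1.increment("BAD_RECORDS", 1);

        job.report("map_0", map0);
        job.report("map_1", map1);

        System.out.println("RECORDS_SEEN = " + job.aggregate().get("RECORDS_SEEN"));
        System.out.println("BAD_RECORDS  = " + job.aggregate().get("BAD_RECORDS"));
    }
}
```

Note that in this model a task only ever touches its own TaskCounters and
never reads the job-level totals, which is consistent with the view that
counters suit short job statistics rather than results or communication
between tasks.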

Regards

Bertrand

On Fri, Oct 19, 2012 at 4:35 PM, Michael Segel <[EMAIL PROTECTED]> wrote:

> As I understand it... each Task has its own counters, which are updated
> independently. As tasks report back to the JT, they update the status of
> their counter(s). The JT will then aggregate them.
>
> In terms of performance, Counters take up some memory in the JT, so while
> it's OK to use them, if you abuse them, you can run into issues.
> As to limits... I guess that will depend on the amount of memory on the JT
> machine, the size of the cluster (Number of TT) and the number of counters.
>
> In terms of global accessibility... Maybe.
>
> The reason I say maybe is that I'm not sure what you mean by globally
> accessible.
> If a task creates and implements a dynamic counter... I know that it will
> eventually be reflected in the JT. However, I do not believe that a
> separate Task could connect with the JT and see if the counter exists or if
> it could get a value or even an accurate value since the updates are
> asynchronous.  Not to mention that I don't believe that the counters are
> aggregated until the job ends. It would make sense that the JT maintains a
> unique counter for each task until the tasks complete. (If a task fails, it
> would have to delete the counters so that when the task is restarted the
> correct count is maintained.) Note, I haven't looked at the source code
> so I am probably wrong.
>
> HTH
> Mike
> On Oct 19, 2012, at 5:50 AM, Lin Ma <[EMAIL PROTECTED]> wrote:
>
> Hi guys,
>
> I have some quick questions regarding Hadoop counters:
>
>
>    - Are Hadoop counters (custom-defined) globally accessible (for both
>    read and write) to all Mappers and Reducers in a job?
>    - What are the performance implications and best practices of using
>    Hadoop counters? If counters are used too heavily, will the performance
>    of the whole job degrade?
>
> regards,
> Lin
>
>
>
--
Bertrand Dechoux