

Re: Hadoop counter
Also, by default the number of counters per job is limited to 120; this limit
is controlled by the mapreduce.job.counters.limit property.
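If you do need more counters, the property Bertrand mentions would typically be raised in mapred-site.xml. A minimal fragment, assuming the Hadoop 1.x-era property name from this thread; the value 500 is only an illustration:

```xml
<property>
  <!-- Default is 120; raise with care, since every counter
       costs memory on the JobTracker. -->
  <name>mapreduce.job.counters.limit</name>
  <value>500</value>
</property>
```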
They are useful for displaying short statistics about a job, but should not
be used to carry actual results (imho).
I know people may misuse them that way, but I haven't tried it myself, so I
can't list the caveats.
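The "short statistics" usage Bertrand recommends boils down to: each task increments its own copy of a counter, and the framework sums the copies into job totals. Since real Hadoop code needs a cluster to run, here is a minimal stdlib-only model of that pattern; the class, method, and counter names are hypothetical, not Hadoop's API (in a real job you would declare an enum and call context.getCounter(...).increment(1)):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CounterModel {
    // Hypothetical counters, mirroring the enum-style counters
    // passed to context.getCounter(...) in a real Hadoop job.
    enum RecordCounter { GOOD_RECORDS, BAD_RECORDS }

    // Each task holds its own counters, updated independently.
    static Map<RecordCounter, Long> runTask(List<String> records) {
        Map<RecordCounter, Long> local = new HashMap<>();
        for (String r : records) {
            RecordCounter c = r.isEmpty() ? RecordCounter.BAD_RECORDS
                                          : RecordCounter.GOOD_RECORDS;
            local.merge(c, 1L, Long::sum);
        }
        return local;
    }

    // The JobTracker aggregates per-task counters into job totals.
    static Map<RecordCounter, Long> aggregate(
            List<Map<RecordCounter, Long>> perTask) {
        Map<RecordCounter, Long> totals = new HashMap<>();
        for (Map<RecordCounter, Long> t : perTask) {
            t.forEach((c, v) -> totals.merge(c, v, Long::sum));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Map<RecordCounter, Long>> perTask = new ArrayList<>();
        perTask.add(runTask(List.of("a", "", "b"))); // 2 good, 1 bad
        perTask.add(runTask(List.of("c", "d")));     // 2 good
        Map<RecordCounter, Long> totals = aggregate(perTask);
        System.out.println(totals.get(RecordCounter.GOOD_RECORDS)); // 4
        System.out.println(totals.get(RecordCounter.BAD_RECORDS));  // 1
    }
}
```

Note that the tasks never see each other's counters; only the aggregated totals exist, and only at the JT.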

Regards

Bertrand

On Fri, Oct 19, 2012 at 4:35 PM, Michael Segel <[EMAIL PROTECTED]> wrote:

> As I understand it, each task has its own counters, which are updated
> independently. As the tasks report back to the JobTracker (JT), they update
> the counters' status, and the JT then aggregates them.
>
> In terms of performance, counters take up some memory in the JT, so while
> it's OK to use them, abusing them can run you into issues.
> As to limits, I guess that will depend on the amount of memory on the JT
> machine, the size of the cluster (the number of TaskTrackers), and the
> number of counters.
>
> In terms of global accessibility... Maybe.
>
> The reason I say maybe is that I'm not sure what you mean by globally
> accessible.
> If a task creates and increments a dynamic counter, I know that it will
> eventually be reflected in the JT. However, I do not believe that a
> separate task could connect to the JT and see whether the counter exists,
> or get its value (or at least an accurate value), since the updates are
> asynchronous. Not to mention that I don't believe the counters are
> aggregated until the job ends. It would make sense for the JT to maintain
> a separate counter per task until the tasks complete. (If a task fails, it
> would have to delete that task's counters so that the correct count is
> maintained when the task is restarted.) Note: I haven't looked at the
> source code, so I may well be wrong.
>
> HTH
> Mike
> On Oct 19, 2012, at 5:50 AM, Lin Ma <[EMAIL PROTECTED]> wrote:
>
> Hi guys,
>
> I have some quick questions regarding Hadoop counters:
>
>
>    - Is a Hadoop counter (custom-defined) globally accessible (for both
>    read and write) by all Mappers and Reducers in a job?
>    - What are the performance characteristics and best practices for using
>    Hadoop counters? I am not sure whether using counters too heavily will
>    degrade the performance of the whole job.
>
> regards,
> Lin
>
>
>
--
Bertrand Dechoux
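To make Mike's point about failed tasks concrete: if the JT kept a failed attempt's counter values, the restarted attempt would re-count the same input and the job total would be inflated. A small stdlib-only sketch of that bookkeeping, under the assumption (stated in the thread, not verified against the source) that the JT keeps one snapshot per task attempt; all names here are hypothetical, not actual JobTracker code:

```java
import java.util.HashMap;
import java.util.Map;

public class JobCounters {
    // Per-task-attempt counter snapshots, keyed by attempt id.
    private final Map<String, Long> perAttempt = new HashMap<>();

    // A task attempt reports its latest local count. In real Hadoop this
    // happens asynchronously; a newer report overwrites the old snapshot.
    void report(String attemptId, long value) {
        perAttempt.put(attemptId, value);
    }

    // On failure, the attempt's counters are discarded so that a retry
    // can re-count its input without double-counting.
    void taskFailed(String attemptId) {
        perAttempt.remove(attemptId);
    }

    // The job-level value is the sum over surviving attempts,
    // typically only finalized once the job ends.
    long total() {
        return perAttempt.values().stream()
                .mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        JobCounters c = new JobCounters();
        c.report("attempt_0", 10);
        c.report("attempt_1", 7);
        c.taskFailed("attempt_0");        // attempt 0 dies; its 10 is dropped
        c.report("attempt_0_retry", 10);  // retry re-counts the same input
        // Without the delete, the total would be 27 (double-counted).
        System.out.println(c.total());    // 17
    }
}
```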