Re: Hadoop counter
Thanks for the long discussion, Mike. I learned a lot from you.


On Tue, Oct 23, 2012 at 11:57 AM, Michael Segel

> Yup.
> The counters at the end of the job are the most accurate.
> On Oct 22, 2012, at 3:00 AM, Lin Ma <[EMAIL PROTECTED]> wrote:
> Thanks so much for the help, Mike. I learned a lot from this discussion.
> So my conclusion from the discussion is: since how and when the JT merges
> counters while a job is still running is undefined, internal behavior,
> it is more reliable to read the counters after the whole job
> completes. Agree?
> regards,
> Lin
> On Sun, Oct 21, 2012 at 8:15 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
>> On Oct 21, 2012, at 1:45 AM, Lin Ma <[EMAIL PROTECTED]> wrote:
>> Thanks for the detailed reply, Mike. Yes, most of my confusion has been
>> resolved. The last two questions (or comments) are to confirm that my
>> understanding is correct:
>> - Is it a normal use case or best practice for a job to automatically
>> consume/read the counters from a previously completed job? I ask this
>> because I am not sure whether counters are mostly meant for human reading
>> and manual analysis, rather than for automatic consumption by another
>> job.
>> Lin,
>> Every job has a set of counters to maintain job statistics.
>> This is specifically for human analysis and to help understand what
>> happened with your job.
>> It allows you to see how much data was read in by the job and how many
>> records were processed, measured against how long the job took to complete.
>> It also shows you how much data was written back out.
>> In addition to this, a set of use cases for counters in Hadoop centers on
>> quality control. It's normal to chain jobs together to form a job flow.
>> A typical use case for Hadoop is to pull data from various sources,
>> combine and process it, resulting in a data set that gets
>> sent to another system for visualization.
>> In this use case, there are usually data cleansing and validation jobs.
>> As they run, it's possible to track the number of defective records. At the
>> end of that specific job, from the ToolRunner, or whichever job class you
>> used to launch your job, you can then get these aggregated counters for the
>> job and determine if the process passed or failed.  Based on this, you can
>> exit your program with either a success or failed flag.  Job Flow control
>> tools like Oozie can capture this and then decide to continue or to stop
>> and alert an operator of an error.
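
A minimal sketch of the pass/fail decision described above, assuming the driver has already read two counter values once the job has finished. The class name `CounterQualityGate`, the idea of a "bad records" and a "total records" counter, and the ratio threshold are all hypothetical and chosen only for illustration; they do not come from the thread.

```java
// Sketch only: counter meanings and the threshold are illustrative assumptions.
final class CounterQualityGate {
    private CounterQualityGate() {}

    /** Returns true when the share of defective records is within the allowed ratio. */
    static boolean passes(long badRecords, long totalRecords, double maxBadRatio) {
        if (totalRecords <= 0) {
            return false; // an empty or unread input is treated as a failure in this sketch
        }
        return (double) badRecords / totalRecords <= maxBadRatio;
    }

    /** Maps the verdict to a process exit code that a flow-control tool can act on. */
    static int exitCode(long badRecords, long totalRecords, double maxBadRatio) {
        return passes(badRecords, totalRecords, maxBadRatio) ? 0 : 1;
    }
}
```

In a real driver, the two values would typically be read after `job.waitForCompletion(true)` returns, for example via `job.getCounters().findCounter(...).getValue()`, and the result would be handed to `System.exit(...)` so that a tool such as Oozie sees the success or failure.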
>> - I want to confirm that my understanding is correct: when each task
>> completes, the JT aggregates/updates the global counter values from the
>> counter values reported by the completed task, but never exposes the
>> global counter values until the job completes? If that is correct, I am
>> wondering why the JT aggregates each time a task completes, rather
>> than doing a one-time aggregation when the job completes. Is there a
>> design reason for this? Thanks.
>> That's a good question. I haven't looked at the code, so I can't say
>> definitively when the JT performs its aggregation. However, while the job
>> is running, we can look at the job tracker web page(s) and see the
>> counter summary. This would imply that there has to be some aggregation
>> occurring mid-flight. (It would be trivial to sum the list of counters
>> periodically to update the job statistics.)  Note too that if the JT web
>> pages can show a counter, it's possible to write a monitoring tool that
>> watches the job while it is running and kills the job mid-flight if a
>> counter reaches a certain threshold.
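
As an illustration of the "sum the list of counters periodically" idea above, here is a small sketch that adds up per-task counter snapshots into one job-wide view. It is not how the JobTracker is implemented (the thread itself says the code was not consulted); the class name `CounterTotals` and the map-based representation are assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch only: models counters as name -> value maps, one map per task.
final class CounterTotals {
    private CounterTotals() {}

    /** Sums per-task counter snapshots into a single map of job-wide totals. */
    static Map<String, Long> sum(List<Map<String, Long>> perTaskCounters) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> task : perTaskCounters) {
            for (Map.Entry<String, Long> entry : task.entrySet()) {
                // Add this task's value to the running total for the counter name.
                totals.merge(entry.getKey(), entry.getValue(), Long::sum);
            }
        }
        return totals;
    }
}
```

Because the sum is recomputed from whatever snapshots are available, a mid-flight total only reflects the tasks that have reported so far, which is consistent with the advice to trust the counters at the end of the job.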
>> That is to say, you could in theory write a monitoring process and watch
>> the counters. If, let's say, an error counter hits a predetermined threshold,
>> you could then issue a 'hadoop job -kill <job-id>' command.
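
The monitoring idea above can be sketched as a simple polling loop. This is a hedged outline, not a tested Hadoop tool: `CounterWatchdog`, the `LongSupplier` that stands in for "read the current counter value", and the `Runnable` that stands in for "kill the job" are all hypothetical names introduced for the example.

```java
import java.util.function.LongSupplier;

// Sketch only: the counter source and the kill action are supplied by the caller.
final class CounterWatchdog {
    private CounterWatchdog() {}

    /**
     * Polls the counter up to maxPolls times, sleeping between polls. Runs killAction
     * once and returns true as soon as the value reaches the threshold; returns false
     * if the threshold is never reached or the thread is interrupted.
     */
    static boolean watch(LongSupplier counter, long threshold, int maxPolls,
                         long sleepMillis, Runnable killAction) {
        for (int i = 0; i < maxPolls; i++) {
            if (counter.getAsLong() >= threshold) {
                killAction.run();
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return false;
    }
}
```

In practice the supplier would read the error counter of the running job, and the kill action would do the equivalent of the 'hadoop job -kill <job-id>' command mentioned above. Since mid-flight counter values are only approximate, the threshold is best treated as a coarse safety limit.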
>> regards,
>> Lin
>> On Sat, Oct 20, 2012 at 3:12 PM, Michael Segel <[EMAIL PROTECTED]