HDFS >> mail # user >> Re: System.out.printlin vs Counters


zheyi rong 2013-03-27, 10:38
Re: System.out.printlin vs Counters
While using System.out inside a Mapper or Reducer is fine as an aid to
learning, be careful: accidentally leaving those calls in (or not moving to
something like log4j) and running the job for real can mean writing
millions of lines of log output on a tasktracker, filling up disks and
making jobs needlessly slow.
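
To make the point above concrete, here is a minimal sketch of level-guarded logging. It uses java.util.logging from the JDK purely as a stand-in for log4j so that it runs without extra libraries; the class name, the CountingHandler, and the processRecords loop are hypothetical and only imitate a map() call being invoked once per record.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

final class LeveledLoggingSketch {

    // Hypothetical handler: counts the lines that would reach the task log
    // instead of writing them anywhere.
    static final class CountingHandler extends Handler {
        final AtomicInteger written = new AtomicInteger();

        @Override
        public void publish(LogRecord record) {
            if (isLoggable(record)) {
                written.incrementAndGet();
            }
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Stand-in for a map task: one debug line per record, one summary line per task.
    static void processRecords(Logger log, int records) {
        for (int i = 0; i < records; i++) {
            // The guard avoids even building the message when FINE is disabled.
            if (log.isLoggable(Level.FINE)) {
                log.fine("processing record " + i);
            }
        }
        log.info("task finished, records=" + records);
    }

    // Runs the stand-in task at the given level and reports how many lines were emitted.
    static int linesWritten(Level level, int records) {
        Logger log = Logger.getLogger("sketch." + level.getName());
        log.setUseParentHandlers(false); // keep the console quiet for this sketch
        log.setLevel(level);
        CountingHandler handler = new CountingHandler();
        log.addHandler(handler);
        processRecords(log, records);
        log.removeHandler(handler);
        return handler.written.get();
    }

    public static void main(String[] args) {
        System.out.println("FINE: " + linesWritten(Level.FINE, 100000) + " lines");
        System.out.println("INFO: " + linesWritten(Level.INFO, 100000) + " lines");
    }
}
```

With the level at FINE every record produces a line, which is the behaviour an unguarded System.out.println would have on every run. Raising the level to INFO for the real job leaves only the per-task summary, without touching the code.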

Paul
On 27 March 2013 10:38, zheyi rong <[EMAIL PROTECTED]> wrote:

> Hello,
>
> Q1.
> It depends on your needs. If you want overall statistics, for
> example the number of malformed records in your dataset,
> use counters. If you just want to know what is going on inside a mapper or
> reducer, use System.out.println().
> Since mappers do not know about each other, you cannot get overall statistics
> for your job by using System.out.println().
> The output of System.out.println() ends up in the task log.
>
> Q2.
> In a distributed environment, mappers do not know about each other. Imagine that
> mapper A is running on one machine and mapper B is running on another
> machine: in mapper A, you cannot get the internal state of mapper B
> simply by calling System.out.println().
>
> Q3.
> Harsh J answered it.
>
> Zheyi.
>
> 2013/3/27 Sai Sai <[EMAIL PROTECTED]>
>
>> Q1. Is it right to assume that System.out.println statements are used only
>> in the Eclipse environment, and that
>> in a multi-node cluster environment we need to use counters?
>>
>> Q2. I am slightly confused: with System.out.println
>> statements
>> we are able to get detailed info at every line of code in Eclipse, whereas
>> counters give just a few lines, not as detailed as System.out.println
>> statements do. So what should we do in a multi-node cluster environment?
>>
>> Q3. Also, when they say the limit of counters is 120, does that mean that
>> if we call
>> context.getCounter("TestGroup1","TestName1").increment(1);
>> more than 120 times it will not be printed in the output? Or does it refer to
>> 120 counters that we can define in an enum?
>>
>> Any help is really appreciated.
>> Thanks
>> Sai
>>
>>
>>
>
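
The distinction drawn in Q1 and Q2 above (per-task output versus job-wide counters) can be sketched in plain Java. This is a toy model, not the Hadoop API: runTask, aggregate, runJob, and the "two comma-separated fields" rule for a well-formed record are all assumptions made for illustration. In a real job the increment would go through context.getCounter(group, name).increment(1), and the framework does the summing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class CounterAggregationSketch {

    // Hypothetical stand-in for one map task. It sees only its own split and
    // keeps only its own counts, just as a println would only describe this task.
    static Map<String, Long> runTask(List<String> split) {
        Map<String, Long> local = new HashMap<>();
        for (String record : split) {
            // Assumed rule for this sketch: a record is well formed if it has two fields.
            String name = record.split(",").length == 2 ? "WELL_FORMED" : "MALFORMED";
            local.merge(name, 1L, Long::sum);
        }
        return local;
    }

    // Stand-in for the framework: sums the per-task counters into job-wide totals.
    static Map<String, Long> aggregate(List<Map<String, Long>> perTask) {
        Map<String, Long> totals = new TreeMap<>();
        for (Map<String, Long> counters : perTask) {
            counters.forEach((name, value) -> totals.merge(name, value, Long::sum));
        }
        return totals;
    }

    // Runs one stand-in task per split and returns the aggregated counters.
    static Map<String, Long> runJob(List<List<String>> splits) {
        List<Map<String, Long>> perTask = new ArrayList<>();
        for (List<String> split : splits) {
            perTask.add(runTask(split));
        }
        return aggregate(perTask);
    }

    public static void main(String[] args) {
        List<List<String>> splits = Arrays.asList(
                Arrays.asList("a,1", "b,2", "broken"),
                Arrays.asList("c,3", "also broken", "x,y,z"));
        System.out.println(runJob(splits));
    }
}
```

No single task in this model could print the job-wide MALFORMED total, because each one only holds the counts for its own split. The total exists only after the per-task values are summed, which is the role counters play in a cluster.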
Harsh J 2013-03-27, 10:27