

Re: "Map input bytes" vs HDFS_BYTES_READ
Harsh:
When LocalJobRunner is used, only "Map input bytes" is calculated.

Can you comment on this case?

Thanks

On Tue, Feb 1, 2011 at 7:52 PM, Harsh J <[EMAIL PROTECTED]> wrote:

> Each task attempt keeps its own counters, independent of other attempts
> and other tasks, which makes the aggregates easier to control. Final
> counters are aggregated only from successfully committed tasks. During
> the job's run, however, counters are shown aggregated from the most
> successful attempts of a task thus far.
>
> On Wed, Feb 2, 2011 at 9:09 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> > If map task(s) were retried (mapred.map.max.attempts times), how would
> > these two counters be affected?
> >
> > Thanks
> >
> > On Tue, Feb 1, 2011 at 7:31 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> >
> >> HDFS_BYTES_READ is a FileSystem-level counter. It tracks the bytes
> >> actually read from the FS (the lower level). Map input bytes is the
> >> number of bytes the RecordReader has processed for the records it
> >> read from the input stream.
> >>
> >> For plain text files, I believe both counters should report about the
> >> same value, provided entire records are read with no operation
> >> performed on each line. But when you throw in a compressed file,
> >> you'll notice that HDFS_BYTES_READ is far less than Map input bytes:
> >> the disk read is small, but the total content in record terms is
> >> still the same as it would be for an uncompressed file.
> >>
> >> Hope this clears it.
> >>
> >> On Wed, Feb 2, 2011 at 8:06 AM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >> > In Hadoop 0.20.2, what's the relationship between "Map input bytes"
> >> > and HDFS_BYTES_READ?
> >> >
> >> > <counter group="FileSystemCounters" name="HDFS_BYTES_READ">203446204073</counter>
> >> > <counter group="FileSystemCounters" name="HDFS_BYTES_WRITTEN">23413127561</counter>
> >> > <counter group="Map-Reduce Framework" name="Map input records">163502600</counter>
> >> > <counter group="Map-Reduce Framework" name="Spilled Records">0</counter>
> >> > <counter group="Map-Reduce Framework" name="Map input bytes">965922136488</counter>
> >> > <counter group="Map-Reduce Framework" name="Map output records">296754600</counter>
> >> >
> >> > Thanks
> >> >
> >>
> >>
> >>
> >> --
> >> Harsh J
> >> www.harshj.com
> >>
> >
>
>
>
> --
> Harsh J
> www.harshj.com
>
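
Editorial illustration (not part of the original messages): Harsh's explanation of why compressed input makes HDFS_BYTES_READ smaller than "Map input bytes" can be sketched outside Hadoop. The snippet below is only an analogy in plain Python, assuming gzip compression and made-up repetitive records. The names `storage_bytes_read` and `record_bytes` are hypothetical stand-ins for the two counters, not Hadoop APIs. The last lines compute the ratio of the two counter values Ted posted. The thread does not say whether his input was compressed, so the ratio is merely consistent with that explanation.

```python
import gzip
import io

# Hypothetical input: repetitive tab-separated text records, one per line.
records = [
    f"record-{i % 10}\tsome repeated payload\n".encode("utf-8")
    for i in range(10_000)
]
raw = b"".join(records)

# What would sit in storage if the input file were gzip-compressed.
compressed = gzip.compress(raw)

# Analogue of HDFS_BYTES_READ: bytes actually pulled from storage.
storage_bytes_read = len(compressed)

# Analogue of "Map input bytes": bytes of the decompressed records
# that a record reader would hand to the mapper, line by line.
record_bytes = 0
record_count = 0
with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as stream:
    for line in stream:
        record_bytes += len(line)
        record_count += 1

print("storage bytes read:", storage_bytes_read)
print("record bytes processed:", record_bytes)
print("records:", record_count)

# Counter values quoted in the thread.
hdfs_bytes_read = 203446204073
map_input_bytes = 965922136488
ratio = map_input_bytes / hdfs_bytes_read
print(f"Map input bytes / HDFS_BYTES_READ = {ratio:.2f}")
```

For the counters in the thread the ratio comes out at roughly 4.7, which is the kind of gap compressed input would produce. For uncompressed plain text the two numbers would be expected to be close to each other.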