Re: Doubt from the book "Definitive Guide"
Mohit Anchlia 2012-04-05, 14:03
On Wed, Apr 4, 2012 at 10:02 PM, Prashant Kommireddi <[EMAIL PROTECTED]> wrote:
> Hi Mohit,
> What would be the advantage? Reducers in most cases read data from all
> the mappers. In the case where mappers were to write to HDFS, a
> reducer would still require to read data from other datanodes across
> the cluster.
The only advantage I was thinking of was that in some cases reducers might be
able to take advantage of data locality and avoid multiple HTTP calls, no?
The data is written anyway, so the last merged file could go to HDFS instead
of local disk. I am new to Hadoop, so I am just asking questions to understand
the rationale behind using local disk for the final output.
> On Apr 4, 2012, at 9:55 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:
> > On Wed, Apr 4, 2012 at 8:42 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> >> Hi Mohit,
> >> On Thu, Apr 5, 2012 at 5:26 AM, Mohit Anchlia <[EMAIL PROTECTED]>
> >> wrote:
> >>> I am going through the chapter "How MapReduce Works" and have some
> >>> confusion:
> >>> 1) The description of the mapper below says that reducers get the
> >>> output file using an HTTP call. But the description under "The Reduce
> >>> Side" doesn't specifically say whether it's copied using HTTP. So,
> >>> first confusion: is the output copied from mapper -> reducer or from
> >>> reducer -> mapper? And second, is the call over http:// or hdfs://?
> >> The flow is as simple as this:
> >> 1. For M+R job, map completes its task after writing all partitions
> >> down into the tasktracker's local filesystem (under mapred.local.dir
> >> directories).
> >> 2. Reducers fetch completion locations from events at JobTracker, and
> >> query the TaskTracker there to provide it the specific partition it
> >> needs, which is done over the TaskTracker's HTTP service (50060).
> >> So to clear things up - map doesn't send it to reduce, nor does reduce
> >> ask the actual map task. It is the task tracker itself that makes the
> >> bridge here.
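The per-reducer partitions written in step 1 are assigned by the job's partitioner. A minimal sketch of the default hash-based assignment (the arithmetic mirrors Hadoop's HashPartitioner, but the class and method names here are illustrative, not the actual Hadoop API):

```java
// Sketch of how a map output key is assigned to a reduce partition.
// Mirrors Hadoop's default HashPartitioner logic; names are illustrative.
public class PartitionSketch {
    // Mask off the sign bit so the result is non-negative, then take
    // the modulus over the number of reduce tasks.
    static int partitionFor(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4;
        for (String key : new String[] {"apple", "banana", "cherry"}) {
            System.out.println(key + " -> partition "
                + partitionFor(key, reducers));
        }
    }
}
```

Each reducer then asks the TaskTracker's HTTP service only for the partition number it owns.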
> >> Note however, that in Hadoop 2.0 the transfer via ShuffleHandler would
> >> be over Netty connections. This should be much faster and more
> >> reliable.
> >>> 2) My understanding was that mapper output gets written to HDFS, since
> >>> I've seen part-m-00000 files in HDFS. If mapper output is written to
> >>> HDFS, shouldn't reducers simply read it from HDFS instead of making
> >>> HTTP calls to the tasktracker's location?
> >> A map-only job usually writes out to HDFS directly (no sorting is done,
> >> since no reducer is involved). If the job is a map+reduce one, the
> >> default output is collected to the local filesystem for partitioning and
> >> sorting at the map end, and eventually grouping at the reduce end.
> >> Basically: data you want to send from mapper to reducer goes to the
> >> local FS so multiple actions can be performed on it; other data may go
> >> directly to HDFS.
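This is also why part-m-00000 files show up in HDFS: they come from the map-only path. A small sketch of the new-API output naming convention, assuming the usual part-<m|r>-<task id> scheme (the helper below is illustrative, not a Hadoop class):

```java
// Sketch of Hadoop's (new API) output file naming.
// A map-only job writes part-m-* straight to HDFS; a map+reduce job's
// final HDFS output is the reducers' part-r-* files.
public class OutputNameSketch {
    static String outputName(boolean mapOnly, int taskId) {
        // "m" for map output, "r" for reduce output, 5-digit task id.
        return String.format("part-%s-%05d", mapOnly ? "m" : "r", taskId);
    }

    public static void main(String[] args) {
        System.out.println(outputName(true, 0));   // part-m-00000
        System.out.println(outputName(false, 3));  // part-r-00003
    }
}
```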
> >> Reducers currently are scheduled pretty randomly, but yes, their
> >> scheduling can be improved for certain scenarios. However, if you are
> >> suggesting that map partitions ought to be written to HDFS itself (with
> >> replication or without), I don't see performance improving. Note that
> >> the partitions aren't merely written but need to be sorted as well (at
> >> either end). Doing that requires the ability to spill frequently (since
> >> we don't have infinite memory to do it all in RAM), and doing such a
> >> thing on HDFS would only mean a slowdown.
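The sort-and-spill behavior described here is governed by a few job properties. A hedged example of the relevant Hadoop 1.x settings (property names as in 1.x; the values shown are only illustrative):

```xml
<!-- mapred-site.xml fragment (Hadoop 1.x names; values illustrative) -->
<property>
  <name>mapred.local.dir</name>
  <!-- Local directories where map partitions are spilled and merged -->
  <value>/disk1/mapred/local,/disk2/mapred/local</value>
</property>
<property>
  <name>io.sort.mb</name>
  <!-- Size (MB) of the in-memory buffer used for sorting map output -->
  <value>100</value>
</property>
<property>
  <name>io.sort.spill.percent</name>
  <!-- Buffer fill fraction at which a background spill to disk starts -->
  <value>0.80</value>
</property>
```

Spilling and re-merging these files on the local disks is cheap; doing the same through HDFS replication would not be.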
> > Thanks for clearing my doubts. In this case I was merely suggesting that
> > if the mapper output (merged output in the end or the shuffle output) is
> > stored in HDFS then reducers can just retrieve it from HDFS instead of
> > asking tasktracker for it. Once reducer threads read it they can continue
> > to work locally.
> >> I hope this helps clear some things up for you.
> >> --
> >> Harsh J