Hadoop, mail # user - MR job scheduler


Re: MR job scheduler
bharath vissapragada 2009-08-21, 16:57
I discussed the same doubt in the HBase forums. I am pasting the reply I got, for those who aren't subscribed to that list.

Regarding optimizing the reduce phase (similar to what Harish was pointing out), I got the following reply from Ryan:

"I think people are confused about how optimal map reduces have to be.
Keeping all the data super-local on each machine is not always helping
you, since you have to read via a socket anyways. Going remote doesn't
actually make things that much slower, since on a modern lan ping
times are < 0.1ms.  If your entire cluster is hanging off a single
switch, there is nearly unlimited bandwidth between all nodes
(certainly much higher than any single system could push).  Only once
you go multi-switch then switch-locality (aka rack locality) becomes
important.

Remember, hadoop isn't about the instantaneous speed of any job, but
about running jobs in a highly scalable manner that works on tens or
tens of thousands of nodes. You end up blocking on single machine
limits anyways, and the r=3 of HDFS helps you transcend a single
machine read speed for large files. Keeping the data transfer local in
this case results in lower performance."
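To put rough numbers on Ryan's point, here is a back-of-the-envelope sketch. All throughput figures below are illustrative assumptions (a commodity disk at ~100 MB/s, a 1 Gbps link at ~125 MB/s, sub-0.1 ms LAN latency), not measurements from any real cluster:

```java
public class TransferEstimate {
    // Assumed throughputs; real clusters vary widely.
    static final double DISK_MB_PER_S = 100.0;   // commodity SATA disk
    static final double NET_MB_PER_S  = 125.0;   // 1 Gbps link ~ 125 MB/s
    static final double PING_S        = 0.0001;  // < 0.1 ms LAN round trip

    // Seconds to read a block from local disk.
    static double localReadSeconds(double blockMb) {
        return blockMb / DISK_MB_PER_S;
    }

    // Seconds to fetch the same block from a peer: one round trip
    // plus the wire transfer (the remote disk read overlaps the send).
    static double remoteReadSeconds(double blockMb) {
        return PING_S + blockMb / NET_MB_PER_S;
    }

    public static void main(String[] args) {
        double blockMb = 128.0; // a common HDFS block size
        System.out.printf("local:  %.2f s%n", localReadSeconds(blockMb));
        System.out.printf("remote: %.2f s%n", remoteReadSeconds(blockMb));
    }
}
```

Under these assumptions the remote fetch is no slower than the local disk read, which is Ryan's point: on a single switch, locality buys you little until the network itself becomes the shared bottleneck.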

Just FYI!
Thanks

On Fri, Aug 21, 2009 at 1:43 PM, Harish Mallipeddi <
[EMAIL PROTECTED]> wrote:

> On Fri, Aug 21, 2009 at 12:11 PM, bharath vissapragada <
> [EMAIL PROTECTED]> wrote:
>
> > Yes, my doubt is: how is the location of the reducer selected? Is it
> > selected arbitrarily, or on a particular machine which already holds
> > more of the values corresponding to that reducer's key, which would
> > reduce the cost of transferring data across the network (because many
> > values for that key are already on the machine where the map phase
> > completed)?
> >
>

>
> I think what you're asking is whether a ReduceTask is scheduled on the
> node which holds the largest partition among all the map-output partitions
> (p1-pN) that the ReduceTask has to fetch in order to do its job. The answer
> is "no" - ReduceTasks are assigned arbitrarily (no such optimization is
> done, and I think it could only be a real optimization if one of your
> partitions is heavily skewed for some reason). Also, as Amogh pointed out,
> the ReduceTasks start fetching their map-output partitions (the shuffle
> phase) as and when they hear about completed maps. So it would not be
> possible to schedule ReduceTasks only on nodes with the largest partitions.
>
> --
> Harish Mallipeddi
> http://blog.poundbang.in
>
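Harish's answer is easier to see next to Hadoop's default partitioning rule: which partition a key lands in is a pure function of the key's hash, computed on the map side, while the node a ReduceTask runs on is chosen independently by the scheduler. A minimal standalone sketch of that formula (the real class is org.apache.hadoop.mapreduce.lib.partition.HashPartitioner; this snippet only mirrors its logic):

```java
public class HashPartitionSketch {
    // Same formula as Hadoop's default HashPartitioner:
    // mask off the sign bit, then mod by the number of reducers.
    static int getPartition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int reducers = 4;
        // Every mapper computes the same partition for the same key,
        // so all values for "apple" meet at one ReduceTask -- but
        // nothing in this formula says WHICH node runs that task.
        System.out.println(getPartition("apple", reducers));
    }
}
```

Because the key-to-partition mapping is fixed before any ReduceTask is placed, scheduling a ReduceTask "near its largest partition" would require the scheduler to wait for all maps to finish, which is exactly what the early shuffle avoids.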