Re: Shuffle phase replication factor
In MR1, the tasktracker serves the mapper files (so that tasks don't have
to stick around taking up resources).  In MR2, the shuffle service, which
lives inside the nodemanager, serves them.

-Sandy
On Thu, May 23, 2013 at 10:22 AM, John Lilley <[EMAIL PROTECTED]> wrote:

> Ling,
>
> Thanks for the response!  I could use more clarification on item 1.
> Specifically:
>
> ·  mapred.reduce.parallel.copies limits the number of outbound
>    connections for a reducer, but not the inbound connections for a
>    mapper.  Does tasktracker.http.threads limit the number of simultaneous
>    inbound connections for a mapper, or only the size of the thread pool
>    servicing the connections?  (I.e., is it one thread per inbound
>    connection?)
>
> ·  Who actually creates the listen port for serving up the mapper files?
>    The mapper task?  Or something more persistent in MapReduce?
>
> Thanks,
>
> John
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Kun Ling
> Sent: Wednesday, May 22, 2013 7:50 PM
> To: user
> Subject: Re: Shuffle phase replication factor
>
> Hi John,
>
>    1. On the limit for the number of simultaneous connections: you can
> configure this with the mapred.reduce.parallel.copies property. The
> default is 5.
>
>    2. On the implication of aggressive disconnects: I am afraid the effect
> is small. Normally, each reducer connects to each mapper task and asks
> for its partition of the map output file. Because each reducer keeps only
> about 5 simultaneous connections open to fetch map output, even in a large
> MR cluster with 1000 nodes and a huge MR job with 1000 mappers and 1000
> reducers, each node sees only about 5 connections. So the impact is small.
>
>    3. What happens to pending/failing connections? The short answer is:
> the reducer just tries to reconnect. There is a List<> that holds all the
> map outputs that still need to be copied, and an element is removed if and
> only if that map output is copied successfully. An endless loop keeps
> scanning the List and fetching the corresponding map output.
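
The retry behavior in point 3 can be sketched roughly as follows. This is only an illustration in Python, not the ReduceTask.java code: the names fetch_map_output and copy_all_map_outputs, the simulated failure rate, and the seed are all hypothetical. It shows a pending list whose entries are removed only after a successful copy, and a loop that keeps rescanning the list until it is empty.

```python
import random


def fetch_map_output(map_id, rng, failure_rate):
    """Hypothetical stand-in for fetching one map output over HTTP."""
    if rng.random() < failure_rate:
        raise ConnectionError(f"fetch of {map_id} failed")
    return f"data-for-{map_id}"


def copy_all_map_outputs(map_ids, failure_rate=0.3, seed=0):
    """Keep retrying until every map output has been copied.

    Returns the copied outputs and the total number of fetch attempts.
    """
    rng = random.Random(seed)      # seeded so the simulation is repeatable
    pending = list(map_ids)        # map outputs that still need to be copied
    copied = {}
    attempts = 0
    while pending:                 # loop until nothing is left to copy
        for map_id in list(pending):   # iterate over a snapshot of the list
            attempts += 1
            try:
                copied[map_id] = fetch_map_output(map_id, rng, failure_rate)
            except ConnectionError:
                continue           # failed: leave it on the list, retry later
            pending.remove(map_id) # removed only after a successful copy
    return copied, attempts


if __name__ == "__main__":
    outputs, tries = copy_all_map_outputs(["map_0", "map_1", "map_2"])
    print(sorted(outputs), tries)
```

A real fetcher would presumably also back off between retries and cap the number of concurrent fetches; both are left out here to keep the sketch short.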
>
> All of the above is based on the Hadoop 1.0.4 source code, especially the
> ReduceTask.java file.
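
The estimate in point 2 can be checked with a quick back-of-the-envelope calculation. The helper below is hypothetical, and it assumes the fetches are spread evenly across the nodes serving map output, which a real job may not do.

```python
def avg_inbound_fetches_per_node(num_nodes, num_reducers, parallel_copies):
    """Average concurrent shuffle fetches per serving node.

    Assumes every reducer uses all of its parallel copy slots and that
    the fetches are spread evenly across the nodes (an assumption).
    """
    total_concurrent = num_reducers * parallel_copies
    return total_concurrent / num_nodes


# 1000 nodes, 1000 reducers, mapred.reduce.parallel.copies = 5
print(avg_inbound_fetches_per_node(1000, 1000, 5))  # 5.0
```

This is an average, so a node holding popular map output could see more connections at a given moment.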
>
> yours,
>
> Ling Kun
>
> On Wed, May 22, 2013 at 10:57 PM, John Lilley <[EMAIL PROTECTED]> wrote:
>
> Ummmm, is that also the limit for the number of simultaneous connections?
> In general, one does not need a 1:1 map between threads and connections.
>
> If this is the connection limit, does it imply that the client or server
> side aggressively disconnects after a transfer?
>
> What happens to the pending/failing connection attempts that exceed the
> limit?
>
> Thanks!
>
> john
>
> From: Rahul Bhattacharjee [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, May 22, 2013 8:52 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Shuffle phase replication factor
>
> There are configuration properties to control the number of copying
> threads:
> tasktracker.http.threads=40
> Thanks,
> Rahul
>
> On Wed, May 22, 2013 at 8:16 PM, John Lilley <[EMAIL PROTECTED]> wrote:
>
> This brings up another nagging question I’ve had for some time.  Between
> HDFS and shuffle, there seems to be the potential for “every node
> connecting to every other node” via TCP.  Are there explicit mechanisms in
> place to manage or limit simultaneous connections?  Is the protocol simply
> robust enough to allow a server-side to disconnect at any time to free up
> slots and the client-side will retry the request?
>
> Thanks
>
> john
>
> From: Shahab Yunus [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, May 22, 2013 8:38 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Shuffle phase replication factor