Hadoop, mail # user - RE: Shuffle phase replication factor


John Lilley 2013-05-22, 14:46
John Lilley 2013-05-23, 17:22
Re: Shuffle phase replication factor
Sandy Ryza 2013-05-23, 17:24
In MR1, the tasktracker serves the mapper files (so that tasks don't have
to stick around taking up resources).  In MR2, the shuffle service, which
lives inside the nodemanager, serves them.

-Sandy
On Thu, May 23, 2013 at 10:22 AM, John Lilley <[EMAIL PROTECTED]> wrote:

> Ling,
>
> Thanks for the response!  I could use more clarification on item 1.
> Specifically:
>
> · mapred.reduce.parallel.copies limits the number of outbound
>   connections for a reducer, but not the inbound connections for a
>   mapper.  Does tasktracker.http.threads limit the number of simultaneous
>   inbound connections for a mapper, or only the size of the thread pool
>   servicing the connections (i.e., is it one thread per inbound
>   connection)?
>
> · Who actually creates the listen port for serving up the mapper files?
>   The mapper task?  Or something more persistent in MapReduce?
>
> Thanks,
>
> John
>
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Kun Ling
> Sent: Wednesday, May 22, 2013 7:50 PM
> To: user
> Subject: Re: Shuffle phase replication factor
>
> Hi John,
>
>   1. On limiting the number of simultaneous connections: you can
> configure this with the mapred.reduce.parallel.copies property. The
> default is 5.
>
>   2. On the implication of aggressive disconnection: I am afraid the
> impact is small. Normally, each reducer connects to each mapper task and
> asks for its partition of the map output file. Each reducer keeps only
> about 5 simultaneous connections open to fetch map output, so even for a
> large MR cluster with 1000 nodes running a huge MR job with 1000 mappers
> and 1000 reducers, each node sees only about 5 connections. So the impact
> is small.
>
>   3. On what happens to pending/failing connections: the short answer is
> that the reducer just tries to reconnect. There is a List<> that holds all
> the map outputs that still need to be copied, and an element is removed
> if and only if that map output is copied successfully. An endless loop
> keeps scanning the List and fetching the corresponding map output.
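
The estimate in item 2 can be checked with a little arithmetic. The sketch below is only an illustration: the class and method names are made up, and it assumes fetches are spread evenly across nodes, which a real cluster will not guarantee.

```java
public class ShuffleFanIn {
    /**
     * Average number of concurrent inbound fetch connections per node,
     * assuming (hypothetically) that fetches are spread evenly.
     */
    static double avgInboundPerNode(int reducers, int parallelCopies, int nodes) {
        // Each reducer holds at most parallelCopies fetches open at once.
        long totalConcurrent = (long) reducers * parallelCopies;
        return (double) totalConcurrent / nodes;
    }

    public static void main(String[] args) {
        // 1000 reducers, mapred.reduce.parallel.copies = 5, 1000 nodes
        System.out.println(avgInboundPerNode(1000, 5, 1000)); // prints 5.0
    }
}
```

This is an average only; a node holding many popular map outputs could see more connections at a given moment.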
>
> All of the above is based on the Hadoop 1.0.4 source code, especially
> the ReduceTask.java file.
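
As a rough illustration of the retry behaviour described in item 3, here is a minimal sketch. It is not the code from ReduceTask.java: the class RetryingCopier, the copyAll method and the fetch predicate are hypothetical names, and the real copier presumably does more than this (for example, delaying retries).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

public class RetryingCopier {
    /**
     * Keeps fetching until every pending map output has been copied.
     * An entry leaves the queue for good only after a successful copy;
     * a failed one is put back at the tail and tried again later.
     * Returns the total number of fetch attempts made.
     * Note: like the loop described above, this never gives up, so it
     * does not terminate if a fetch keeps failing forever.
     */
    static int copyAll(List<String> mapOutputs, Predicate<String> fetch) {
        Deque<String> pending = new ArrayDeque<>(mapOutputs);
        int attempts = 0;
        while (!pending.isEmpty()) {
            String id = pending.pollFirst();
            attempts++;
            if (!fetch.test(id)) {
                pending.addLast(id); // failed: keep it for a later retry
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Hypothetical fetcher: every output fails on its first attempt
        // and succeeds on the second.
        Set<String> seen = new HashSet<>();
        int attempts = copyAll(List.of("map_0", "map_1", "map_2"),
                               id -> !seen.add(id));
        System.out.println(attempts); // prints 6
    }
}
```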
>
> Yours,
>
> Ling Kun
>
> On Wed, May 22, 2013 at 10:57 PM, John Lilley <[EMAIL PROTECTED]> wrote:
>
> Ummmm, is that also the limit for the number of simultaneous connections?
> In general, one does not need a 1:1 map between threads and connections.
>
> If this is the connection limit, does it imply that the client or server
> side aggressively disconnects after a transfer?
>
> What happens to the pending/failing connection attempts that exceed the
> limit?
>
> Thanks!
>
> john
>
> From: Rahul Bhattacharjee [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, May 22, 2013 8:52 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Shuffle phase replication factor
>
> There are configuration properties to control the number of threads used
> for copying, for example:
> tasktracker.http.threads=40
>
> Thanks,
> Rahul
>
> On Wed, May 22, 2013 at 8:16 PM, John Lilley <[EMAIL PROTECTED]> wrote:
>
> This brings up another nagging question I’ve had for some time.  Between
> HDFS and shuffle, there seems to be the potential for “every node
> connecting to every other node” via TCP.  Are there explicit mechanisms in
> place to manage or limit simultaneous connections?  Is the protocol simply
> robust enough to allow the server side to disconnect at any time to free
> up slots, with the client side retrying the request?
>
> Thanks
>
> john
>
> From: Shahab Yunus [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, May 22, 2013 8:38 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Shuffle phase replication factor