MapReduce >> mail # dev >> Shuffling over the network for local map data.


Suresh Kumar 2013-01-22, 03:22
Steve Loughran 2013-01-22, 16:46
Suresh Kumar 2013-01-22, 17:36
Suresh Kumar 2013-01-22, 19:02
Albert Chu 2013-01-22, 19:42
Suresh Kumar 2013-01-22, 23:03
Luke Lu 2013-01-22, 19:24
Re: Shuffling over the network for local map data.
Hi Luke,

I checked /etc/hosts and it is configured correctly. It looks like the
slow shuffle read speeds we were seeing are due to slow disk I/O.
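
For reference, one way to double-check what the JVM actually resolves is a
small standalone program like the one below. It is only a sketch: the class
name `HostCheck` and the method `resolvesToLoopback` are made up for this
example and are not part of Hadoop. It relies only on the standard
`java.net.InetAddress` API.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper for illustration; not Hadoop code. */
public class HostCheck {

    /** Returns true if the given host name or IP literal resolves to a loopback address. */
    public static boolean resolvesToLoopback(String host) throws UnknownHostException {
        // getByName consults the system resolver, which normally honours /etc/hosts.
        return InetAddress.getByName(host).isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // Use the name passed on the command line, or fall back to this machine's host name.
        String host = args.length > 0 ? args[0] : InetAddress.getLocalHost().getHostName();
        System.out.println(host + " resolves to loopback: " + resolvesToLoopback(host));
    }
}
```

Running it with the node's own host name as the argument shows whether that
name maps to a loopback address on that machine.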

I will go through the MAPREDUCE-4049 change and see if I can update my
patch to work with that code on version 3.0.0.
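
For anyone following the thread, the general link-instead-of-copy idea can
be sketched as below. This is not the patch itself and not Hadoop code: the
class `LocalShuffleLink` and the method `linkOrCopy` are hypothetical names,
the sketch assumes Java 7 or later (`java.nio.file.Files.createLink`), and
it ignores that a reducer normally needs only its own partition of a map
output file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystemException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper for illustration; not code from the patch or from Hadoop. */
public class LocalShuffleLink {

    /**
     * Makes the data in mapOutput available at reduceInput.
     * Returns true if a hard link was created, false if it fell back to a copy.
     */
    public static boolean linkOrCopy(Path mapOutput, Path reduceInput) throws IOException {
        // createLink fails if the target already exists, so clear it first.
        Files.deleteIfExists(reduceInput);
        try {
            // A hard link points at the same data on disk, so nothing is rewritten.
            Files.createLink(reduceInput, mapOutput);
            return true;
        } catch (UnsupportedOperationException | FileSystemException e) {
            // No hard-link support, or source and target are on different volumes.
            Files.copy(mapOutput, reduceInput, StandardCopyOption.REPLACE_EXISTING);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("shuffle-sketch");
        Path mapOutput = dir.resolve("map.out");
        Path reduceInput = dir.resolve("reduce.in");
        Files.write(mapOutput, "sample map output".getBytes(StandardCharsets.UTF_8));

        boolean linked = linkOrCopy(mapOutput, reduceInput);
        System.out.println(linked ? "hard link created" : "fell back to copy");
    }
}
```

The fallback to a plain copy keeps the behaviour correct when the two paths
sit on different volumes; only the disk I/O saving is lost in that case.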

I had not thought of EC2; that is a good idea.

Thanks,
Suresh.

On Tue, Jan 22, 2013 at 11:24 AM, Luke Lu <[EMAIL PROTECTED]> wrote:

> You can set up the right /etc/hosts to support the loopback. OTOH, saving
> disk io would be more important for small clusters with large instances.
> Hadoop historically works on large clusters with relatively small
> instances, so the issue was not as acute. MAPREDUCE-4049 allows the shuffle
> to be pluggable, so you won't have to patch Hadoop framework code itself.
>
> Are you saying that you don't have access to EC2?
>
>
> On Tue, Jan 22, 2013 at 11:02 AM, Suresh Kumar <[EMAIL PROTECTED]> wrote:
>
> > I have a patch that tries to use file links instead of making a copy of
> > the data that is already available locally. I tested it on a
> > single-machine cluster configuration running 48 mappers and reducers.
> > Unfortunately, I do not have access to a cluster, even a small one. Can
> > someone review and test run my patch?
> >
> > I created the patch using Eclipse against 1.0.3. My knowledge of Java is
> > limited and the code is not well written in some classes. So please let
> > me know if I need to make changes to the code, along with a short
> > explanation of the change. I will happily do so.
> >
> > Thanks,
> > Suresh.
> >
> >
> >
> >
>