Hadoop >> mail # user >> Multi-threaded map task


Mark Olimpiati 2013-01-14, 05:57
Nitin Pawar 2013-01-14, 06:34
Mark Olimpiati 2013-01-14, 07:22
Re: Multi-threaded map task
Well... it all depends on where your bottleneck is. Benchmark your use case
if it is critical. Multi-threading is not always useful, and you would
rather avoid locally shared mutable state, because it can become a pain to
manage. But that doesn't mean you can't do multi-threading...
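
To illustrate the point above about locally shared mutable state, here is a
minimal plain-Java sketch. It is not Hadoop code, and the class and method
names (ThreadedSplitSketch, countWords) are hypothetical. It assumes one task
that processes the records of a single split with several threads, where each
thread keeps its own private counts and the results are merged only at the
end:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration, not Hadoop code: one "task" processes a single
// split with several threads, and each thread keeps its own private counts.
class ThreadedSplitSketch {

    // Counts words in the records of one split using `threads` workers.
    static Map<String, Integer> countWords(List<String> records, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Map<String, Integer>>> partials = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int offset = t;
                // Worker t handles records offset, offset + threads, ...
                partials.add(pool.submit(() -> {
                    Map<String, Integer> local = new TreeMap<>();
                    for (int i = offset; i < records.size(); i += threads) {
                        for (String word : records.get(i).split("\\s+")) {
                            if (!word.isEmpty()) {
                                local.merge(word, 1, Integer::sum);
                            }
                        }
                    }
                    return local;
                }));
            }
            // Merge on the calling thread only, so no map is ever shared
            // between running workers.
            Map<String, Integer> total = new TreeMap<>();
            for (Future<Map<String, Integer>> partial : partials) {
                for (Map.Entry<String, Integer> e : partial.get().entrySet()) {
                    total.merge(e.getKey(), e.getValue(), Integer::sum);
                }
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> split = Arrays.asList("a b a", "b c", "a");
        System.out.println(countWords(split, 2)); // prints {a=3, b=2, c=1}
    }
}
```

Whether this beats running several single-threaded tasks still depends on
where the bottleneck is, so it is worth benchmarking both.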

You only need to browse the type hierarchy a bit to find MultithreadedMapper:
http://hadoop.apache.org/docs/r1.0.4/api/org/apache/hadoop/mapreduce/lib/map/MultithreadedMapper.html

Regards

Bertrand

On Mon, Jan 14, 2013 at 8:22 AM, Mark Olimpiati <[EMAIL PROTECTED]> wrote:

> Thanks for the reply Nitin, but I don't see what the bottleneck is in
> having it distributed with multi-threaded maps?
>
> I see your point that each map is processing a different split, but my
> question is: if each map task had 2 threads processing the same split,
> either multiplexed or running in parallel, wouldn't that be faster given
> enough cores?
>
> Mark
>
>
> On Sun, Jan 13, 2013 at 10:34 PM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
>
> > That's because it's a distributed processing framework over a network.
> > On Jan 14, 2013 11:27 AM, "Mark Olimpiati" <[EMAIL PROTECTED]> wrote:
> >
> > > Hi, this is a simple question, but why weren't map or reduce tasks
> > > programmed to be multi-threaded? I.e., instead of spawning 6 map tasks
> > > for 6 cores, run one map task with 6 parallel threads.
> > >
> > > In fact I tried this myself, but it turns out that threading is not
> > > helping as it would in regular Java programs for some reason. Any
> > > feedback on this topic?
> > >
> > > Thanks,
> > > Mark
> > >
> >
>

--
Bertrand Dechoux
Mark Olimpiati 2013-01-14, 20:23
Mark Olimpiati 2013-01-14, 20:44