

Re: JVM reuse in Map Tasks
Hi Arpit,

A point to mention from http://www.cloudera.com/blog/2009/12/7-tips-for-improving-mapreduce-performance/:

If each task takes less than 30-40 seconds, reduce the number of tasks. The task setup and scheduling overhead is a few seconds, so if tasks finish very quickly, you’re wasting time while not doing work. JVM reuse can also be enabled to solve this problem.
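To put rough numbers on that tip, here is a small sketch of the arithmetic. The 3-second overhead is only my assumption for "a few seconds", and the class and method names are made up for illustration; they are not part of Hadoop.

```java
public class TaskOverhead {
    // Fraction of a task's wall-clock time that goes to setup and scheduling
    // rather than to useful work.
    static double overheadFraction(double overheadSeconds, double workSeconds) {
        return overheadSeconds / (overheadSeconds + workSeconds);
    }

    public static void main(String[] args) {
        double overhead = 3.0; // assumed value for "a few seconds"
        for (double work : new double[] {10.0, 40.0}) {
            System.out.printf("work=%.0fs overhead=%.0f%%%n",
                    work, 100 * overheadFraction(overhead, work));
        }
    }
}
```

Under that assumption a task doing 10 seconds of work spends roughly 23% of its time on overhead, while a task doing 40 seconds of work spends roughly 7%, which is why fewer, longer tasks (or JVM reuse) help.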

Further, I can think of one more case: if the mapper has to build a huge tree in the child JVM (say the implementation needs it), that tree can be reused by later tasks that run in the same JVM rather than being built again and again.
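On your setup() question: as far as I know, setup() runs once for every map task, even when the task lands in a reused JVM, so you cannot stop the call itself. What you can do is guard the expensive part with a static field, since static state stays around for as long as the JVM does. Below is a minimal plain-Java sketch of that idea. SharedTree and FakeMapTask are hypothetical names, FakeMapTask only stands in for a real Mapper, and the tree contents are dummy data. In Hadoop 1.x, JVM reuse itself is controlled by the mapred.job.reuse.jvm.num.tasks property.

```java
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedTreeSketch {
    // Hypothetical holder: a static field lives as long as the JVM,
    // so every task that runs in this JVM sees the same instance.
    static final class SharedTree {
        static final AtomicInteger BUILD_COUNT = new AtomicInteger();
        private static TreeMap<String, Integer> tree;

        // Build the tree on first use only; later callers get the cached one.
        static synchronized TreeMap<String, Integer> get() {
            if (tree == null) {
                tree = build();
            }
            return tree;
        }

        // Stand-in for the expensive construction (dummy data).
        private static TreeMap<String, Integer> build() {
            BUILD_COUNT.incrementAndGet();
            TreeMap<String, Integer> t = new TreeMap<>();
            for (int i = 0; i < 1000; i++) {
                t.put("key" + i, i);
            }
            return t;
        }
    }

    // Stand-in for a Mapper: setup() is invoked once per task.
    static final class FakeMapTask {
        private TreeMap<String, Integer> tree;

        void setup() {
            tree = SharedTree.get();
        }

        Integer lookup(String key) {
            return tree.get(key);
        }
    }

    public static void main(String[] args) {
        // Simulate three tasks running one after another in the same JVM.
        for (int i = 0; i < 3; i++) {
            FakeMapTask task = new FakeMapTask();
            task.setup();
            task.lookup("key42");
        }
        System.out.println("tree builds: " + SharedTree.BUILD_COUNT.get());
    }
}
```

This only helps if the tree is read-only once built (or otherwise safe to share), and a task that starts in a fresh JVM will still build it once.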

Cheers,
Subroto Sanyal

On Jun 4, 2012, at 2:12 PM, Arpit Wanchoo wrote:

> Hi
>
> I wanted to check what exactly we gain when JVM reuse is enabled in a MapReduce job.
>
> My question is about the setup() method of the mapper. Is it called for a mapper even if that mapper runs in the JVM of a previously run mapper?
> If yes, is there any way I can control it or stop it from being called more than once?
>
> Regards,
> Arpit Wanchoo | Sr. Software Engineer
> Guavus Network Systems.
> 6th Floor, Enkay Towers, Tower B & B1,Vanijya Nikunj, Udyog Vihar Phase - V, Gurgaon,Haryana.
> Mobile Number +91-9899949788
>
