Re: JVM reuse in Map Tasks
Hi Arpit,

One relevant point from http://www.cloudera.com/blog/2009/12/7-tips-for-improving-mapreduce-performance/:

If each task takes less than 30-40 seconds, reduce the number of tasks. The task setup and scheduling overhead is a few seconds, so if tasks finish very quickly, you’re wasting time while not doing work. JVM reuse can also be enabled to solve this problem.
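To put rough numbers on that tip, here is a minimal sketch of the arithmetic. The 3-second overhead is only an assumed stand-in for "a few seconds", and `TaskOverhead` and `overheadFraction` are hypothetical names, not anything from Hadoop. It also treats the useful work and the overhead as separate parts of a task's wall-clock time, which is a simplification.

```java
// Hypothetical helper, not part of Hadoop: estimates how much of a task's
// wall-clock time goes to setup and scheduling rather than useful work.
class TaskOverhead {
    // overheadSeconds: per-task setup and scheduling cost (assumed value)
    // workSeconds: time the task spends doing actual map work
    static double overheadFraction(double overheadSeconds, double workSeconds) {
        return overheadSeconds / (overheadSeconds + workSeconds);
    }

    public static void main(String[] args) {
        double overhead = 3.0; // assumption standing in for "a few seconds"
        for (double work : new double[] {5.0, 35.0, 300.0}) {
            System.out.printf("work=%.0fs overhead=%.1f%%%n",
                    work, 100 * overheadFraction(overhead, work));
        }
    }
}
```

Under these assumptions, a task doing 5 seconds of work spends 37.5% of its time on overhead, while a task doing several minutes of work spends about 1%. That is why fewer, longer tasks (or JVM reuse) help.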

Another benefit I can think of: suppose the implementation needs to build a huge tree in the map phase inside a child JVM. With JVM reuse enabled, the tree can be built once and then reused by the later tasks that run in the same JVM, rather than being rebuilt for every task.
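A minimal sketch of that idea, without the Hadoop dependencies so it runs on its own. `TreeCachingMapper` is a hypothetical stand-in for a real `Mapper` subclass, and `buildTree()` is a placeholder for the expensive construction. As far as I know, `setup()` is still called once for every task even when the JVM is reused, so the usual way to avoid repeating the work is to keep the result in a static field and guard it. In Hadoop 1.x, reuse itself is controlled by the `mapred.job.reuse.jvm.num.tasks` property (-1 means no limit).

```java
import java.util.TreeMap;

// Hypothetical stand-in for a Hadoop Mapper. A real class would extend
// org.apache.hadoop.mapreduce.Mapper and override setup(Context).
class TreeCachingMapper {
    // Static state lives as long as the JVM, so it survives across tasks
    // when the child JVM is reused.
    private static TreeMap<String, Integer> sharedTree;
    private static int buildCount = 0;

    // Called once per task, like Mapper.setup(). The guard makes sure the
    // expensive tree is built only once per JVM.
    void setup() {
        synchronized (TreeCachingMapper.class) {
            if (sharedTree == null) {
                sharedTree = buildTree();
                buildCount++;
            }
        }
    }

    // Placeholder for the expensive construction step.
    private static TreeMap<String, Integer> buildTree() {
        TreeMap<String, Integer> tree = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            tree.put("key" + i, i);
        }
        return tree;
    }

    Integer lookup(String key) {
        return sharedTree.get(key);
    }

    static int getBuildCount() {
        return buildCount;
    }

    public static void main(String[] args) {
        // Two mapper instances simulate two tasks run back to back in one reused JVM.
        TreeCachingMapper firstTask = new TreeCachingMapper();
        firstTask.setup();
        TreeCachingMapper secondTask = new TreeCachingMapper();
        secondTask.setup();
        System.out.println("tree built " + getBuildCount() + " time(s)"); // prints "tree built 1 time(s)"
    }
}
```

Note that this only helps tasks of the same job that land in the same child JVM. The tree is still built once per JVM, not once per job, and it should be treated as read-only after it is built.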

Cheers,
Subroto Sanyal

On Jun 4, 2012, at 2:12 PM, Arpit Wanchoo wrote:

> Hi
>
> I wanted to check what exactly we gain when JVM reuse is enabled in a MapReduce job.
>
> My question is about the mapper's setup() method. Is it called for a mapper even when that mapper runs in a JVM reused from a previously run mapper?
> If so, is there any way I can control this, or stop it from being called more than once?
>
> Regards,
> Arpit Wanchoo | Sr. Software Engineer
> Guavus Network Systems.
> 6th Floor, Enkay Towers, Tower B & B1,Vanijya Nikunj, Udyog Vihar Phase - V, Gurgaon,Haryana.
> Mobile Number +91-9899949788
>