Re: VM reuse!

OK, thanks Bejoy.

So it only helps in certain scenarios, like the one you described, where
there are many more map tasks than map slots.
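
To put rough numbers on that scenario (100 map slots, 500 map tasks, as in the quoted reply below), here is a back-of-the-envelope sketch in Java. It is not Hadoop code: the class name, the method names and the 1-second JVM startup cost are made-up assumptions for illustration. It also assumes tasks are spread evenly over the slots and that reuse is unlimited, which in Hadoop 1.x would correspond to mapred.job.reuse.jvm.num.tasks set to -1.

```java
// Back-of-the-envelope estimate, not Hadoop code. All names and the
// per-JVM startup cost are hypothetical, chosen only for illustration.
final class JvmReuseEstimate {

    // Number of task "waves" each slot has to run, rounded up.
    static int wavesPerSlot(int mapTasks, int mapSlots) {
        return (mapTasks + mapSlots - 1) / mapSlots;
    }

    // JVMs started for the job: one per task without reuse; with
    // unlimited reuse, at most one per slot that receives work.
    static int jvmSpawns(int mapTasks, int mapSlots, boolean reuse) {
        return reuse ? Math.min(mapTasks, mapSlots) : mapTasks;
    }

    // Total JVM startup/cleanup time avoided by reuse, in milliseconds.
    static long startupMillisSaved(int mapTasks, int mapSlots, long perJvmMillis) {
        int avoided = jvmSpawns(mapTasks, mapSlots, false)
                    - jvmSpawns(mapTasks, mapSlots, true);
        return avoided * perJvmMillis;
    }

    public static void main(String[] args) {
        int tasks = 500;
        int slots = 100;
        long perJvmMillis = 1000L; // assumed cost; measure on a real cluster

        System.out.println("waves per slot:      " + wavesPerSlot(tasks, slots));
        System.out.println("JVMs without reuse:  " + jvmSpawns(tasks, slots, false));
        System.out.println("JVMs with reuse:     " + jvmSpawns(tasks, slots, true));
        System.out.println("startup ms avoided:  "
                + startupMillisSaved(tasks, slots, perJvmMillis));
    }
}
```

Under these assumptions each slot runs 5 tasks in sequence, so reuse cuts JVM starts from 500 to 100. The saving is spread across slots running in parallel, so the effect on wall-clock time is smaller than the total, and it matters most when individual tasks are short.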

On Tue, Apr 16, 2013 at 2:40 PM, Bejoy Ks <[EMAIL PROTECTED]> wrote:

> Hi Rahul
> Consider a larger cluster and jobs that involve larger input data sets.
> The data would be spread across the whole cluster, and a single node
> might hold several blocks of that data set. Imagine you have a cluster
> with 100 map slots and your job has 500 map tasks. In that case multiple
> map tasks have to run on a single TaskTracker, scheduled as slots become
> available.
> If you enable JVM reuse here, tasks of the job that run one after another
> in a slot on a TaskTracker would share the same JVM instead of each
> getting a fresh one. The benefit is just the time you save by not
> spawning and cleaning up a JVM for each individual task.
> On Tue, Apr 16, 2013 at 2:04 PM, Rahul Bhattacharjee <
>> Hi,
>> I have a question related to VM reuse in Hadoop. I now understand the
>> purpose of VM reuse, but I am wondering how useful it is.
>> For example, for VM reuse to be effective or to kick in, more than one
>> mapper task has to be submitted to a single node (for the same job).
>> Hadoop prefers to spawn mappers on the nodes that actually contain the
>> data, so it might rarely happen that multiple mappers are allocated to a
>> single TaskTracker. And even if a single node gets to run multiple
>> mappers, they might as well run in parallel in multiple VMs rather than
>> sequentially in a single VM.
>> I am sure I am missing some link here; please help me find it.
>> Thanks,
>> Rahul
bejoy.hadoop@... 2013-04-16, 16:45