MapReduce, mail # user - One mapper/reducer runs on a single JVM


Re: One mapper/reducer runs on a single JVM
Michael Segel 2012-11-06, 16:27
If you exceed the amount of physical memory available, the OS writes memory pages out to swap space on disk. Moving pages from memory to disk and back again is known as 'swapping'.
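
On a Linux node, one way to see whether swap is in use is to compare the SwapTotal and SwapFree fields in /proc/meminfo. Here is a minimal Python sketch, assuming the usual Linux meminfo layout with values in kB. The function name swap_used_kb is made up for illustration.

```python
def swap_used_kb(meminfo_text):
    """Return the amount of swap in use (kB), given the text of /proc/meminfo."""
    fields = {}
    for line in meminfo_text.splitlines():
        # Lines look like "SwapTotal:       2097148 kB"
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields["SwapTotal"] - fields["SwapFree"]
```

To try it on a live node, pass it the contents of the file, e.g. swap_used_kb(open("/proc/meminfo").read()). A value that keeps growing while jobs run suggests the node is over-committed.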

HBase is highly sensitive to the latency of swapping pages between physical memory and disk. You need to avoid swap when running HBase. Swapping can crash a region server, and ultimately you can end up with a cascading failure that takes HBase down.
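
To stay out of swap, the task JVMs plus everything else on the node have to fit in physical memory. A rough back-of-the-envelope sketch in Python follows. The function name and the numbers are hypothetical, and a real budget should also account for per-JVM overhead beyond the heap.

```python
def max_task_slots(physical_mb, reserved_mb, per_task_mb):
    """Largest number of task JVMs that fit in RAM without relying on swap.

    physical_mb: RAM on the node
    reserved_mb: RAM set aside for the OS and daemons
                 (DataNode, TaskTracker, RegionServer, ...)
    per_task_mb: memory budgeted per mapper/reducer JVM
    """
    if per_task_mb <= 0:
        raise ValueError("per_task_mb must be positive")
    available = physical_mb - reserved_mb
    # Never report a negative slot count if the reservation exceeds RAM
    return max(available // per_task_mb, 0)


# Hypothetical node: 32 GB RAM, 12 GB reserved, 1280 MB per task JVM
print(max_task_slots(32 * 1024, 12 * 1024, 1280))  # prints 16
```

The total of map slots and reduce slots configured on the node should not exceed that number.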

HTH

-Mike

On Nov 5, 2012, at 11:06 PM, Lin Ma <[EMAIL PROTECTED]> wrote:

> Thanks Michael,
>
> "If you are running just Hadoop, you could have a little swap. Running HBase, fuggit about it." -- could you give a bit more information about what you mean by swap, and why I should forget about it for HBase?
>
> regards,
> Lin
>
>
> On Tue, Nov 6, 2012 at 12:46 PM, Michael Segel <[EMAIL PROTECTED]> wrote:
> Mappers and Reducers are separate JVM processes.
> And yes, you need to take into account the amount of memory on the machine(s) when you configure the number of slots.
>
> If you are running just Hadoop, you could have a little swap. Running HBase, fuggit about it.
>
>
> On Nov 5, 2012, at 7:12 PM, Lin Ma <[EMAIL PROTECTED]> wrote:
>
> > Hello Hadoop experts,
> >
> > I have had a question on my mind for a long time. Suppose I am developing a Java-based M-R program (a Java UDF that implements the mapper or reducer interface). In this scenario, is each mapper or reducer a separate JVM process? E.g., if there are 4 mappers on a machine, are they 4 individual processes? I am also wondering whether the processes on a single machine will affect each other when each JVM tries to get more memory to run faster.
> >
> > thanks in advance,
> > Lin
> >
> >
>
>