On Thu, Dec 9, 2010 at 4:35 PM, Pedro Costa <[EMAIL PROTECTED]> wrote:
> 1 - Hadoop MR contains a TaskMemoryManagerThread class that is used to
> manage memory usage of tasks running under a TaskTracker. Why does Hadoop
> MR need a class to manage memory? Why can't it rely on the JVM, or
> is this class here for another purpose?
Streaming and pipes map/reduce applications launch native processes
from the map/reduce tasks, and those processes are outside the control
of the JVM. Even regular Java map/reduce programs can fork/exec other
programs. All of these processes consume memory that would not be
accounted for if we relied only on the JVM to report memory usage.
Hence there is a separate class that looks at the entire process tree
of the map/reduce task to account for the memory consumed.
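
To make the process-tree idea concrete, here is a minimal sketch of
the accounting. It is not the actual TaskMemoryManagerThread code: the
ProcessTreeMemory and ProcInfo names are made up for illustration, and
the snapshot is an in-memory list that I am assuming was collected
elsewhere (on Linux, for example, from /proc).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not Hadoop's real implementation.
class ProcessTreeMemory {

    // One row of a process-table snapshot (assumed to be gathered elsewhere).
    static final class ProcInfo {
        final int pid;
        final int parentPid;
        final long memoryBytes;

        ProcInfo(int pid, int parentPid, long memoryBytes) {
            this.pid = pid;
            this.parentPid = parentPid;
            this.memoryBytes = memoryBytes;
        }
    }

    // Sums the memory of rootPid and every descendant found in the snapshot.
    // Returns 0 if rootPid is not present in the snapshot.
    static long treeMemoryBytes(int rootPid, List<ProcInfo> snapshot) {
        Map<Integer, ProcInfo> byPid = new HashMap<>();
        Map<Integer, List<Integer>> children = new HashMap<>();
        for (ProcInfo p : snapshot) {
            byPid.put(p.pid, p);
            children.computeIfAbsent(p.parentPid, k -> new ArrayList<>()).add(p.pid);
        }

        long total = 0;
        Set<Integer> visited = new HashSet<>(); // guards against malformed parent links
        Deque<Integer> pending = new ArrayDeque<>();
        if (byPid.containsKey(rootPid)) {
            pending.push(rootPid);
        }
        while (!pending.isEmpty()) {
            int pid = pending.pop();
            if (!visited.add(pid)) {
                continue; // already counted
            }
            total += byPid.get(pid).memoryBytes;
            for (int child : children.getOrDefault(pid, Collections.<Integer>emptyList())) {
                pending.push(child);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<ProcInfo> snapshot = new ArrayList<>();
        snapshot.add(new ProcInfo(100, 1, 200L));   // task JVM
        snapshot.add(new ProcInfo(101, 100, 300L)); // native child forked by the task
        snapshot.add(new ProcInfo(102, 101, 50L));  // grandchild of the task
        snapshot.add(new ProcInfo(200, 1, 999L));   // unrelated process
        System.out.println(treeMemoryBytes(100, snapshot));
    }
}
```

In this sample the JVM alone accounts for 200 bytes, while the whole
tree rooted at pid 100 accounts for 550. That difference is exactly
the memory a JVM-only view would miss.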
> 2 - How does the JT know that a Map or Reduce task has finished? Is it
> through the heartbeat?