Any comments/corrections on my understanding of uber mode?
Thanks in advance!
2013/11/8 sam liu <[EMAIL PROTECTED]>
> Hi Experts,
> In previous discussions, I found following descriptions:
> "mapreduce.job.ubertask.enable | (false) | 'Whether to enable the
> small-jobs "ubertask" optimization, which runs "sufficiently small" jobs
> sequentially within a single JVM. "Small" is defined by the following
> maxmaps, maxreduces, and maxbytes settings. Users may override this value.'"
> Based on the above description, I set "mapreduce.job.ubertask.enable" to true
> and also configured the other uber-related parameters; after running some
> experiments, I have the following understanding.
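For reference, the uber-related settings can be configured in mapred-site.xml roughly like this (the threshold values below are illustrative examples; maxbytes defaults to the HDFS block size):

```xml
<!-- Enable uber mode: "sufficiently small" jobs run inside the AM's JVM. -->
<property>
  <name>mapreduce.job.ubertask.enable</name>
  <value>true</value>
</property>
<!-- A job with more map tasks than this is not uberized (default: 9). -->
<property>
  <name>mapreduce.job.ubertask.maxmaps</name>
  <value>9</value>
</property>
<!-- A job with more reduce tasks than this is not uberized (default: 1). -->
<property>
  <name>mapreduce.job.ubertask.maxreduces</name>
  <value>1</value>
</property>
<!-- A job whose input exceeds this many bytes is not uberized
     (example value: 128 MB, a common HDFS block size). -->
<property>
  <name>mapreduce.job.ubertask.maxbytes</name>
  <value>134217728</value>
</property>
```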
> 1) If I submit a bunch of small MR jobs to the Hadoop cluster (each MR job
> will run in uber mode):
> - Each MR job corresponds to an application, like
> - Each application has its own container, like
> - When a container is launched by the NodeManager, it launches a JVM as
> well. When the container stops, the JVM stops too. A container has only one
> JVM over its whole life cycle.
> - An application such as application_1383815949546_0006 includes some map
> tasks and reduce tasks
> - In uber mode, all the map tasks and reduce tasks of
> application_1383815949546_0006 will be executed in one and the same
> container, container_1383815949546_0010_01_000001. This also means that all
> map tasks and reduce tasks will be executed in a single JVM.
> - A container cannot be shared among different applications (jobs)
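My understanding of the "sufficiently small" check can be sketched as follows. This is an illustration of the decision logic, not Hadoop's actual source; the class and method names are made up, and only the threshold values mirror the real mapreduce.job.ubertask.maxmaps/maxreduces/maxbytes defaults:

```java
// Hypothetical sketch of the uber-mode "small job" decision: a job is
// uberized only if its map count, reduce count, and total input size all
// fall within the configured thresholds. Class/method names are invented.
public class UberDecisionSketch {

    // Defaults of mapreduce.job.ubertask.maxmaps and ...maxreduces.
    static final int MAX_MAPS = 9;
    static final int MAX_REDUCES = 1;
    // mapreduce.job.ubertask.maxbytes defaults to the HDFS block size;
    // assume 128 MB here for the sketch.
    static final long MAX_BYTES = 128L * 1024 * 1024;

    static boolean isSmallEnoughForUber(int maps, int reduces, long inputBytes) {
        return maps <= MAX_MAPS
            && reduces <= MAX_REDUCES
            && inputBytes <= MAX_BYTES;
    }

    public static void main(String[] args) {
        // A small job (4 maps, 1 reduce, 10 MB input) qualifies -> true
        System.out.println(isSmallEnoughForUber(4, 1, 10L * 1024 * 1024));
        // A big job (200 maps, 10 reduces, 10 GB input) does not -> false
        System.out.println(isSmallEnoughForUber(200, 10, 10L * 1024 * 1024 * 1024));
    }
}
```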
> 2) If I submit a bunch of big MR jobs to the Hadoop cluster (each MR job
> will NOT run in uber mode):
> - Each map task and reduce task of application_1383815949546_0006 will
> be executed in its own container, which means that
> application_1383815949546_0006 will use many containers.
> I am not sure whether the above understandings are correct or not, so any
> comments/corrections will be appreciated!