Re: Hive parallel execution deadlocks, need restart of yarn-nodemanager
You mentioned you only have one NodeManager.
So, is Hive generating 3 MapReduce jobs? And how many map and reduce tasks does each job have?
What is your yarn.nodemanager.resource.memory-mb? That determines the maximum number of containers you can run.
You are running into an issue where all the jobs run in parallel, and because each job now has its own ApplicationMaster, which also occupies a container, the jobs end up in a scheduling livelock. A single node does not have enough capacity to run many jobs in parallel.
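To make the arithmetic concrete, here is a rough sketch in Python. The numbers (4096 MB for the node, 1536 MB per container) are made up for illustration and are not taken from your cluster, and the model assumes every container, ApplicationMaster or task, is the same size, which is a simplification of what the scheduler really does.

```python
def max_containers(node_memory_mb: int, container_mb: int) -> int:
    """Containers that fit on one NodeManager, assuming equal-sized containers."""
    return node_memory_mb // container_mb


def task_slots_left(node_memory_mb: int, container_mb: int, parallel_jobs: int) -> int:
    """Containers left for map/reduce tasks once each running job holds one AM."""
    total = max_containers(node_memory_mb, container_mb)
    # Each job needs one ApplicationMaster container before it can run any task.
    am_containers = min(parallel_jobs, total)
    return total - am_containers


# Hypothetical node: 4096 MB for containers, 1536 MB per container.
for jobs in (1, 2, 3):
    slots = task_slots_left(4096, 1536, jobs)
    state = "stalls at 0%" if slots == 0 else "can make progress"
    print(f"{jobs} parallel job(s): {slots} task slot(s) left, {state}")
```

If your real numbers look like the last two cases, lowering the number of parallel Hive threads or giving the NodeManager more memory should leave room for the tasks themselves.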
On Dec 6, 2012, at 5:24 AM, Alexandre Fouche wrote:
> Is there a known deadlock issue or bug when using Hive parallel execution with more parallel Hive threads than there are computing nodemanagers?
> On my test cluster, I have set Hive parallel execution to 2 or 3 threads, and have only 1 computing nodemanager with 5 CPU cores.
> When I run a Hive request with a lot of unions that decomposes into a lot of jobs to be executed in parallel, after a few jobs are done it always ends up deadlocking at 0% mapping for all parallel jobs (from the Hive-server2 logs). If I restart hadoop-yarn-nodemanager on the nodemanager server, Hive gets out of its deadlock and continues, until it deadlocks again a bit later.