Re: Hive parallel execution deadlocks, need restart of yarn-nodemanager
Alexandre Fouche 2012-12-07, 08:18
Ah, I see, I had missed the fact that each MR job has an ApplicationMaster that takes up a container, so there were none free to run mappers (my jobs usually have only one mapper due to small input data). I figured that out thanks to your explanations, and by testing with more nodes and greater concurrency; just as before, all containers were running ApplicationMasters!
Thank you very much!
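The container math behind this can be sketched as follows. This is a minimal illustration of the livelock described below, not the cluster's actual settings; the memory values are assumptions.

```python
# Sketch of the YARN scheduling livelock discussed in this thread.
# All numbers are illustrative assumptions, not the real cluster's config.

NODE_MEMORY_MB = 3072   # yarn.nodemanager.resource.memory-mb (assumed)
CONTAINER_MB = 1024     # assumed size of both AM and map containers

# The NodeManager's memory bounds how many containers can run at once.
total_containers = NODE_MEMORY_MB // CONTAINER_MB  # 3

# Hive launches several MR jobs in parallel; each job's ApplicationMaster
# occupies one container before any map task can be scheduled.
parallel_jobs = 3
am_containers = min(parallel_jobs, total_containers)
free_for_tasks = total_containers - am_containers

# Every container now holds an AM waiting for a map slot that can never
# be allocated: a scheduling livelock.
print(total_containers, am_containers, free_for_tasks)  # 3 3 0
```

The fix is simply to keep the number of simultaneously running jobs below the node's container count, so at least one container is always left for map tasks.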
Lead operations engineer, cloud architect
http://www.cleverscale.com | @cleverscale
On Thursday 6 December 2012 at 21:08, Vinod Kumar Vavilapalli wrote:
> You mentioned you only have one NodeManager.
> So, is hive generating 3 MapReduce jobs? And how many map and reduce tasks for each job?
> What is your yarn.nodemanager.resource.memory-mb? That determines the maximum number of containers you can run.
> You are running into an issue where all the jobs are running in parallel, and because each job now has an 'ApplicationMaster' which itself occupies a container, the jobs are getting into a scheduling livelock. On a single node you will not have enough capacity to run many jobs in parallel.
> On Dec 6, 2012, at 5:24 AM, Alexandre Fouche wrote:
> > Is there a known deadlock issue or bug when using Hive parallel execution with more parallel Hive threads than there are compute NodeManagers?
> > On my test cluster, I have set Hive parallel execution to 2 or 3 threads, and have only 1 compute NodeManager with 5 CPU cores.
> > When I run a Hive query with a lot of unions that decomposes into many jobs to be executed in parallel, after a few jobs complete it always ends up deadlocked at 0% map progress for all parallel jobs (according to the HiveServer2 logs). If I restart hadoop-yarn-nodemanager on the NodeManager server, Hive gets out of its deadlock and continues, until it deadlocks again a bit later.
> > Alex
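For anyone hitting the same livelock: one workaround, sketched here as a suggestion rather than the thread's confirmed fix, is to cap Hive's job concurrency below the node's container count. The container count is roughly yarn.nodemanager.resource.memory-mb (in yarn-site.xml) divided by the per-container allocation; hive.exec.parallel.thread.number defaults to 8.

```sql
-- Hedged sketch: cap Hive's parallel MR jobs so that the per-job
-- ApplicationMasters cannot occupy every container on the node.
-- Values are illustrative, not taken from this thread.
set hive.exec.parallel=true;
set hive.exec.parallel.thread.number=2;
```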