Hadoop, mail # user - Hadoop JobTracker Hanging


Re: Hadoop JobTracker Hanging
Ted Yu 2010-06-21, 20:16
Before the new hardware is ready, I suggest you configure jobtracker to
retain fewer jobs in memory - as Todd mentioned.
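
As a rough illustration of the "retain fewer jobs" change, the sketch below edits a Hadoop-style configuration document in Python. It assumes the relevant setting is mapred.jobtracker.completeuserjobs.maximum (the number of completed jobs per user kept in JobTracker memory); the thread does not name the property, so check mapred-default.xml for your Hadoop version before relying on it. The helper names set_property and get_property are made up for this example.

```python
import xml.etree.ElementTree as ET

# Assumption: this is the property that limits completed jobs kept in memory.
# Verify the name against mapred-default.xml for your Hadoop version.
RETAIN_PROP = "mapred.jobtracker.completeuserjobs.maximum"


def _find_property(root, name):
    """Return the <property> element whose <name> matches, or None."""
    for prop in root.findall("property"):
        if (prop.findtext("name") or "").strip() == name:
            return prop
    return None


def get_property(config_xml, name):
    """Return the value of a property in a Hadoop-style config, or None."""
    prop = _find_property(ET.fromstring(config_xml), name)
    return None if prop is None else prop.findtext("value")


def set_property(config_xml, name, value):
    """Return config_xml with the property set, adding it if it is absent."""
    root = ET.fromstring(config_xml)
    prop = _find_property(root, name)
    if prop is None:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
    value_el = prop.find("value")
    if value_el is None:
        value_el = ET.SubElement(prop, "value")
    value_el.text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Illustrative input only; a real mapred-site.xml will have more entries.
    original = "<configuration></configuration>"
    # 25 is an arbitrary example value, not a recommendation from the thread.
    print(set_property(original, RETAIN_PROP, 25))
```

A change like this only takes effect after the JobTracker is restarted with the updated configuration file.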

On Mon, Jun 21, 2010 at 12:49 PM, Bobby Dennett
<[EMAIL PROTECTED]> wrote:

> Thanks all for your suggestions (please note that Tan is my co-worker;
> we are both working to try and resolve this issue)... we experienced
> another hang this weekend and increased the HADOOP_HEAPSIZE setting to
> 6000 (MB) as we do periodically see "java.lang.OutOfMemoryError: Java
> heap space" errors in the jobtracker log. We are now looking into the
> resource allocation of the master node/server to ensure we aren't
> experiencing any issues due to the heap size increase. In parallel, we
> are also working on building "beefier" servers -- stronger CPUs, 3x more
> memory -- for the node running the primary namenode and jobtracker
> processes as well as for the secondary namenode.
>
> Any additional suggestions you might have for troubleshooting/resolving
> this hanging jobtracker issue would be greatly appreciated.
>
> Please note that I had previously started a similar topic on Get
> Satisfaction
> (http://www.getsatisfaction.com/cloudera/topics/looking_for_troubleshooting_tips_guidance_for_hanging_jobtracker),
> where Todd is helping and the output of jstack and jmap can be found.
>
> Thanks,
> -Bobby
>
> On Fri, 18 Jun 2010 15:04 -0600, "Li, Tan" <[EMAIL PROTECTED]> wrote:
> > Todd,
> > I will try to increase the HADOOP_HEAPSIZE to see if that helps.
> > Tan
> >
> > -----Original Message-----
> > From: Todd Lipcon [mailto:[EMAIL PROTECTED]]
> > Sent: Thursday, June 17, 2010 5:07 PM
> > To: [EMAIL PROTECTED]
> > Subject: Re: Hadoop JobTracker Hanging
> >
> > Li, just to narrow your search, in my experience this is usually caused
> > by an OOME on the JT. Check the logs for OutOfMemoryError and see what
> > you find. You may need to configure it to retain fewer jobs in memory,
> > or up your heap.
> >
> > -Todd
> >
> > On Thu, Jun 17, 2010 at 5:03 PM, Li, Tan <[EMAIL PROTECTED]> wrote:
> >
> > > Thanks for your tips, Ted.
> > > All of our QA is done on 0.20.1, and I have a feeling it is not
> > > version-related.
> > > I will run jstack and jmap once the problem happens again, and I may
> > > need your help to analyze the result.
> > >
> > > Tan
> > >
> > > -----Original Message-----
> > > From: Ted Yu [mailto:[EMAIL PROTECTED]]
> > > Sent: Thursday, June 17, 2010 2:39 PM
> > > To: [EMAIL PROTECTED]
> > > Subject: Re: Hadoop JobTracker Hanging
> > >
> > > Is upgrading to hadoop-0.20.2+228 possible?
> > >
> > > Use jstack to get a stack trace of the job tracker process when this
> > > happens again.
> > > Use jmap to get shared object memory maps or heap memory details.
> > >
> > > On Thu, Jun 17, 2010 at 2:00 PM, Li, Tan <[EMAIL PROTECTED]> wrote:
> > >
> > > > Folks,
> > > >
> > > > I need some help with the job tracker.
> > > > I am running two Hadoop clusters (with 30+ nodes) on Ubuntu. One is
> > > > on version 0.19.1 (Apache) and the other is on version 0.20.1+169.68
> > > > (Cloudera).
> > > >
> > > > I have the same problem with both clusters: the job tracker hangs
> > > > almost once a day.
> > > > Symptom: The job tracker web page cannot be loaded, the command
> > > > "hadoop job -list" hangs, and the jobtracker.log file stops being
> > > > updated.
> > > > I cannot find any useful information in the job tracker log file.
> > > > The symptom is gone after I restart the job tracker, and the cluster
> > > > runs fine for another 20+ hour period. Then the symptom comes back.
> > > >
> > > > I do not have any serious problems with HDFS.
> > > >
> > > > Any ideas about the causes? Is there any configuration parameter that
> > > > I can change to reduce the chances of the problem?
> > > > Any tips for diagnosing and troubleshooting?
> > > >
> > > > Thanks!
> > > >
> > > > Tan
> > > >
> > > >
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Todd Lipcon
> > Software Engineer, Cloudera
> >
>
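
The advice in this thread to check the JobTracker log for out-of-memory errors can be sketched in Python as below. This is only a minimal illustration: it matches the "java.lang.OutOfMemoryError" text quoted by Bobby, the function name find_oom_lines is made up for this example, and the log path in the usage section is a placeholder that depends on your installation.

```python
import re
import sys

# Matches the error text quoted in the thread:
# "java.lang.OutOfMemoryError: Java heap space"
OOM_PATTERN = re.compile(r"java\.lang\.OutOfMemoryError")


def find_oom_lines(lines):
    """Return (line_number, text) pairs for lines that mention an OOM error."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if OOM_PATTERN.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Usage: python find_oom.py /path/to/jobtracker.log
    # The path is a placeholder; log locations vary between installations.
    with open(sys.argv[1], errors="replace") as log_file:
        for number, text in find_oom_lines(log_file):
            print(f"{number}: {text}")
```

The timestamps on any matching lines can then be compared with the times at which the JobTracker stopped responding.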