HBase >> mail # user >> HBase 0.90.0 region servers dying


Enis Soztutar 2011-02-16, 08:40
Ryan Rawson 2011-02-16, 08:46
Ted Dunning 2011-02-16, 09:00
Eric 2011-02-16, 13:13
Enis Soztutar 2011-02-18, 06:14
Jean-Daniel Cryans 2011-02-18, 19:50
Re: HBase 0.90.0 region servers dying
Yes indeed but no luck.

Enis

On Fri, Feb 18, 2011 at 11:50 AM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:

> Just to make sure, you did check the .out file after a failure, right?
>
> J-D
>
> On Thu, Feb 17, 2011 at 10:14 PM, Enis Soztutar
> <[EMAIL PROTECTED]> wrote:
> > Hi,
> >
> > Thanks everyone for the answers.
> > I had already increased the file descriptors to 32768. The region servers
> > and the zookeeper processes are dying, but datanode and tasktrackers keep
> > running (they are configured with a max heap of 1 GB). The logs do not
> > contain any indication that something is going wrong. The last entries in
> > the logs are typical INFO-level messages. I have also checked the kernel
> > logs, but the kernel does not report that it is killing the processes
> > either. While testing, two of the servers restarted at different times,
> > which was the original reason that I had suspected a memory error. But
> > after we replaced the power supplies, the nodes did not restart, but the
> > processes kept dying.
> >
> > For the load, the YCSB test for 10M records goes on for a while at 4K
> > inserts per sec, but cannot complete because the region servers die one
> > by one. iostat also shows light CPU and I/O utilization, around 20%. Any
> > more suggestions for debugging would be more than welcome.
> >
> > Thanks,
> > Enis
> >
> > On Wed, Feb 16, 2011 at 5:13 AM, Eric <[EMAIL PROTECTED]> wrote:
> >
> >> Did you increase the max open files on your system (in
> >> /etc/security/limits.conf) ?
> >>
> >
> >> 2011/2/16 Enis Soztutar <[EMAIL PROTECTED]>
> >>
> >> > Hi,
> >> >
> >> > We have newly set up a cluster of 5 nodes, each with 16 GB of RAM. We
> >> > use HBase 0.90.0 on top of Hadoop from CDH3. When testing HBase under
> >> > heavy load generated by YCSB, we consistently see region servers dying
> >> > silently, without any logs or exceptions (not even in system logs). We
> >> > couldn't track down the problem, so we tested the same setup on a
> >> > Rackspace cluster with 7 nodes but similar hardware, and we didn't have
> >> > any problems.
> >> >
> >> > We suspect a problem with the RAM or the motherboards, but all memory
> >> > tests run successfully. I was wondering whether anyone has had similar
> >> > problems before, and whether there is anything you would suggest to
> >> > nail down the issue.
> >> >
> >> > Thanks,
> >> > Enis
> >> >
> >>
> >
>
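
The thread above asks whether the open-file limit was raised in /etc/security/limits.conf. A value set there only helps if the running process actually picked it up, so one possible check is to read the limit the process itself sees. The following is a minimal Python sketch under stated assumptions: it assumes a Linux host where /proc/<pid>/limits is available, and the function names (current_nofile_limit, parse_max_open_files, nofile_limit_of) are hypothetical helpers written for this illustration, not part of HBase or Hadoop.

```python
import resource
from pathlib import Path


def current_nofile_limit():
    """Return the (soft, hard) open-file limits of this Python process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def parse_max_open_files(limits_text):
    """Extract (soft, hard) as strings from the text of /proc/<pid>/limits.

    Returns None if no "Max open files" row is present. Values are kept as
    strings because the kernel may report "unlimited" instead of a number.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: Max open files <soft> <hard> files
            fields = line.split()
            return fields[3], fields[4]
    return None


def nofile_limit_of(pid):
    """Read the open-file limits of another running process (Linux only)."""
    return parse_max_open_files(Path(f"/proc/{pid}/limits").read_text())


if __name__ == "__main__":
    print("this process:", current_nofile_limit())
```

Calling nofile_limit_of with the PID of a region server (for example, one found with jps) would show whether that process is running with 32768 or with a lower default. Reading the limit of the live process is preferred here over parsing limits.conf because the file only describes what a new login session should get, not what an already-started daemon has.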
Jean-Daniel Cryans 2011-02-22, 20:34
Stack 2011-02-22, 21:18