HBase >> mail # user >> Region server shutting down due to HDFS error


Re: Region server shutting down due to HDFS error
Hmmm... I couldn't find it either, so I looked at the history of that
file, and sure enough, a few check-ins back it had that message.
I have no idea how something like this could happen. I know I had some
merge issues when I first got the latest version and built the project, but
I then reverted all local changes and rebuilt. The only thing I can
imagine is that the previously compiled class file was not updated and
was the one that got included in the JAR, although I don't really know how
that could happen.
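
One way to test the stale-class theory would be to scan the built JAR for
the log phrase. This is only a rough sketch: the function name and the JAR
path in the comment are placeholders, and it assumes the message is stored
in the class file as a plain ASCII string literal, so a raw byte search can
find it.

```python
import zipfile


def classes_containing(jar_path, needle):
    """Return the .class entries in a JAR whose raw bytes contain `needle`.

    Assumption: the log message is an ASCII string literal, so it appears
    byte-for-byte in the compiled class file's constant pool.
    """
    needle_bytes = needle.encode("utf-8")
    hits = []
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            # Only inspect compiled classes; skip resources and manifests.
            if name.endswith(".class") and needle_bytes in jar.read(name):
                hits.append(name)
    return hits


# Hypothetical usage (the JAR path is a placeholder):
#   classes_containing("path/to/hbase.jar", "getting node's version")
```

If the phrase turns up in a class whose current source no longer contains
it, the JAR was most likely packaged with a leftover class file, and a clean
rebuild of the output directory should get rid of it.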

-eran

On Wed, Mar 28, 2012 at 18:53, Ted Yu <[EMAIL PROTECTED]> wrote:

> Eran:
> The error indicates a ZooKeeper-related issue.
> Do you see a KeeperException after the error log?
>
> I searched the 90 codebase but couldn't find the exact log phrase:
>
> zhihyu$ find src/main -name '*.java' -exec grep "getting node's version in CLOSI" {} \; -print
> zhihyu$ find src/main -name '*.java' -exec grep 'Error getting ' {} \; -print
>
> Cheers
>
> On Wed, Mar 28, 2012 at 9:45 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
>
> > I don't see any prior HDFS issues in the 15 minutes before this
> > exception.
> > The logs on the datanode reported as problematic are clean as well.
> > However, I now see the log is full of errors like this:
> > 2012-03-28 00:15:05,358 DEBUG org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Processing close of gs_users,731481|Sn쒪㝨眳ԫ䂣⫰==,1331226388691.29929cb2200b3541ead85e17b836ade5.
> > 2012-03-28 00:15:05,359 WARN org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler: Error getting node's version in CLOSING state, aborting close of gs_users,731481|Sn쒪㝨眳ԫ䂣⫰==,1331226388691.29929cb2200b3541ead85e17b836ade5.
> >
> > -eran
> >
> >
> >
> > On Wed, Mar 28, 2012 at 18:38, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
> >
> > > Any chance we can see what happened before that too? Usually you
> > > should see a lot more HDFS spam before getting the message that all
> > > the datanodes are bad.
> > >
> > > J-D
> > >
> > > On Wed, Mar 28, 2012 at 4:28 AM, Eran Kutner <[EMAIL PROTECTED]> wrote:
> > > > Hi,
> > > >
> > > > We have region servers sporadically stopping under load, supposedly
> > > > due to errors writing to HDFS. Things like:
> > > >
> > > > 2012-03-28 00:37:11,210 WARN org.apache.hadoop.hdfs.DFSClient: Error while syncing
> > > > java.io.IOException: All datanodes 10.1.104.10:50010 are bad. Aborting..
> > > >
> > > > It's happening with a different region server and data node every time,
> > > > so it's not a problem with one specific server, and there doesn't seem
> > > > to be anything really wrong with either of them. I've already increased
> > > > the file descriptor limit, datanode xceivers and data node handler
> > > > count. Any idea what could be causing these errors?
> > > >
> > > >
> > > > A more complete log is here: http://pastebin.com/wC90xU2x
> > > >
> > > > Thanks.
> > > >
> > > > -eran
> > >
> >
>