-
0.94.2 failing regionservers
Bryan Beaudreault 2013-03-04, 17:00
We recently upgraded multiple clusters to CDH 4.2, which comes with HBase 0.94.2. Since then we've seen region servers die periodically in a way I never saw on CDH3.

Here are the exceptions:

First I see a slew of these: http://pastebin.com/WqSwMzuZ

Then the regionserver starts closing all its regions, after throwing this exception: http://pastebin.com/396wX3iw

This has now happened on multiple servers across multiple clusters, all CDH 4.2.

Any thoughts? During this time our NameNodes seem to be doing fine, and I don't see any issues on our datanodes either.
-
Re: 0.94.2 failing regionservers
Ted Yu 2013-03-04, 17:04
In the stack trace, I see HFileReaderV1.java. I would expect your store files to be upgraded to HFile v2 after the cluster upgrade.

Can you tell us more about how you upgraded your clusters?

Thanks
-
Re: 0.94.2 failing regionservers
Bryan Beaudreault 2013-03-04, 17:20
Interesting.

We upgraded by creating entirely new clusters with the CDH 4.2 software installed on them. Then we ran a distributed copy of the /hbase directory to the new clusters. We haven't run a major compaction since then, so that could be the reason there are still v1 HFiles. This migration took place just a couple of days ago.
-
Re: 0.94.2 failing regionservers
Ted Yu 2013-03-04, 17:23
In hindsight, this should have helped:

http://hbase.apache.org/book.html#upgrade0.94

I would suggest upgrading existing HFile v1 files to v2 format.

Cheers
-
Re: 0.94.2 failing regionservers
Bryan Beaudreault 2013-03-04, 17:26
We upgraded from CDH3, which is HBase 0.90. Looking around, we do have some v2 format HFiles; we're just not entirely there yet. I'll try to get them all converted.
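
A minimal sketch of one way to tell v1 and v2 store files apart. It assumes, and this is not confirmed anywhere in the thread, that an HFile stores its format version as a big-endian 32-bit integer in the last 4 bytes of the file, with the major version in the low 3 bytes and the minor version in the high byte. The function name and the idea of running it against a local copy of a store file are illustrative only.

```python
import struct


def hfile_version(path):
    """Return (major, minor) decoded from the last 4 bytes of a local file.

    Assumption: the trailer ends with a big-endian int whose low 3 bytes
    are the major version and whose high byte is the minor version.
    """
    with open(path, "rb") as f:
        f.seek(-4, 2)  # position 4 bytes before the end of the file
        (raw,) = struct.unpack(">I", f.read(4))
    return raw & 0x00FFFFFF, raw >> 24
```

Under that assumption, a file reporting major version 1 has not yet been rewritten in the v2 format.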
-
Re: 0.94.2 failing regionservers
Jean-Daniel Cryans 2013-03-04, 17:41
This looks a lot like:

https://issues.apache.org/jira/browse/HBASE-6479

FWIW running with V1 files is ok (unless you hit this bug, obviously).

J-D
-
Re: 0.94.2 failing regionservers
Ted Yu 2013-03-04, 17:51
I logged HBASE-7991 for backport.

Cheers
-
Re: 0.94.2 failing regionservers
Bryan Beaudreault 2013-03-04, 17:53
Interesting, thanks for pointing that out, J-D. I'm running major compactions now on the clusters to try to get everything over to V2.
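
For readers wanting to do the same, here is a rough sketch of how major compactions might be requested for several tables at once. It only builds a script of `major_compact` commands for the HBase shell; the helper name and the table names in the usage note are hypothetical, and nothing here reflects how the compactions in this thread were actually triggered.

```python
def major_compact_script(tables):
    """Build an HBase shell script that requests a major compaction per table."""
    lines = []
    for name in tables:
        # Refuse names that would break the single-quoted shell argument.
        if "'" in name:
            raise ValueError("unexpected quote in table name: %r" % name)
        lines.append("major_compact '%s'" % name)
    lines.append("exit")
    return "\n".join(lines) + "\n"
```

The output of something like `major_compact_script(["table_a", "table_b"])` could be written to a file and piped into `hbase shell`. Note that `major_compact` only queues the request, so the rewrite of the store files finishes some time after the shell returns.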