[VOTE] Release 'development release' HBase 0.89.2010830 rc2?
Jean-Daniel Cryans 2010-09-14, 00:14

Second RC, new vote!

Binary and source tarballs are available here:

http://people.apache.org/~jdcryans/hbase-0.89.20100830-candidate-2/

You can also browse the candidate documentation here:

http://people.apache.org/~jdcryans/hbase-0.89.20100830-candidate-2/hbase-0.89.20100830/docs/

Roughly 23 issues have been resolved since 0.89.20100726, our second 0.89.x release, including fixed deadlocks, better handling of IOEs during splits, and improvements for filters: see http://su.pr/2HwiUe. Three issues were also fixed for RC2:

HBASE-2975 DFSClient names in master and RS should be unique
HBASE-2967 Failed split: IOE 'File is Corrupt' -- sync length not being written out to SequenceFile
HBASE-2964 Deadlock when RS tries to RPC to itself inside SplitTransaction

Shall we release this candidate as the third in our 0.89.x series of developer releases?

Please see previous threads on 0.89 releases for more information about the purpose of this release candidate - in particular, this 'developer release' is for those who can tolerate risk and who are willing to give feedback in advance of our next major release. We're not making any guarantees that this is bug free. It's definitely not for production deploys.

We'll do another release like this in a few weeks after the new master code has gone in.

Please vote by Thursday, September 16th.

Thanks,

J-D

Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc2?
Jean-Daniel Cryans 2010-09-15, 23:35

After some discussions today here at SU between Todd and the team, it was suggested that this 0.89 release contain more of what we run in production here. One major difference is that we reverted most of HBASE-2694, since we had issues with the ZK-based assignment, didn't know exactly how many other issues lurked in there, figured that most of those fixes would probably not apply to the new master, and found that it was generally much slower than the pre-2694 master. I also helped Vidhya with his 700 nodes today by patching 0.89.20100830 with 2694's revert, and starting his cluster became much faster.

tl;dr I propose that we sink this RC and build a new one with 2694 reverted (except for the core ZKW changes).

What do the devs think?

Thx,

J-D

Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc2?
Ted Yu 2010-09-15, 23:44

Looping in Patrick who may have insight for https://issues.apache.org/jira/browse/HBASE-2694

Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc2?
Jean-Daniel Cryans 2010-09-15, 23:47

Ted,

Just to be clear, the issue isn't ZK, it's us. HBASE-2694 was a stepping stone, but the master rewrite ended up in its own branch. That stepping stone isn't needed to run HBase properly.

J-D

Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc2?
Patrick Hunt 2010-09-16, 04:31

Sounds like this one isn't ZK-related, but if you run up against something, feel free to ping me.

Patrick

Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc2?
Todd Lipcon 2010-09-17, 01:39

On Wed, Sep 15, 2010 at 4:35 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
> tl;dr I propose that we sink this RC and build a new one with 2694
> reverted (except for the core ZKW changes).
>
> What do the devs think?

+1.

I think we all anticipate that the *next* RC (including the new master) is going to be less stable initially until we've gone through some rounds of testing and fixes. So let's make this last pre-new-master release as good as possible. Releasing something that people are already running successfully in production seems like a good idea.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera

Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc2?
Stack 2010-09-17, 03:47

On Wed, Sep 15, 2010 at 4:35 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote:
> tl;dr I propose that we sink this RC and build a new one with 2694
> reverted (except for the core ZKW changes).

+1

Add HBASE-2986 I'd say.

Thanks J-D,
St.Ack

[VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Jean-Daniel Cryans 2010-09-24, 23:03

The 0.89.20100830 DR branch was cancelled; here's the new RC off a new branch.

As discussed, this release candidate contains a revert of HBASE-2694, which means that it is back on the "very" old master. It is also very similar to what we run here in production.

Sources and binaries can be found here:

http://people.apache.org/~jdcryans/hbase-0.89.20100924-candidate-1/

Documentation:

http://people.apache.org/~jdcryans/hbase-0.89.20100924-candidate-1/hbase-0.89.20100924/docs/

Here's the list of everything I added since moving from 0830:

HBASE-3008 Memstore.updateColumnValue passes wrong flag to heapSizeChange
HBASE-3035 Bandaid for HBASE-2990
HBASE-2643 Figure how to deal with eof splitting logs
HBASE-2941 port HADOOP-6713 - threading scalability for RPC reads - to HBase
HBASE-3006 Reading compressed HFile blocks causes way too many DFS RPC calls severly impacting performance
HBASE-2989 [replication] RSM won't cleanup after locking if 0 peers
HBASE-2992 [replication] MalformedObjectNameException in ReplicationMetrics
HBASE-3034 Revert the regions assignment part of HBASE-2694 (and pals) for 0.89
HBASE-3033 [replication] ReplicationSink.replicateEntries improvements
HBASE-2997 Performance fixes - profiler driven
HBASE-2889 Tool to look at HLogs -- parse and tail -f (patch #2 only)

Unfortunately I forgot to add HBASE-2986 like Stack asked (sorry, I just figured it out while reading the old voting thread).

Should we release this as the next "Development Release"? Please cast your vote by Wednesday, September 29th.

Thanks,

The HBase Team

Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Stack 2010-09-24, 23:21

-1 (Sorry). We need HBASE-2986. Without it, the client hangs if a split happens in the midst of a put.

Otherwise, I'd already checked it out -- doc looks good, ran under load on a cluster -- and would have +1'd it were it not for your calling out the absence of HBASE-2986, J-D.

St.Ack

Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Stack 2010-09-24, 23:38

Changing my vote after J-D and I did a bit of digging. HBASE-2986 is a fix for HBASE-1845, the issue that added multi*; i.e. multiget, multidelete, etc. This 0.89 was cut from the branch before HBASE-1845 was applied.

+1

St.Ack

Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Cosmin Lehene 2010-10-04, 15:33

+1

I tested the build over the weekend by running MR jobs and doing some write performance testing. It looks good.

Cosmin

Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Todd Lipcon 2010-10-04, 16:00

+0 from me - I did some brief testing but not quite enough for a full +1. I was testing on a new cluster and had forgotten to set up ulimit, so it exploded after a few hours. Will upgrade to a +1 later this week if I have time to run a longer test.

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
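
A quick pre-flight check of the kind that would catch a forgotten ulimit before a long test run might look like the sketch below. The message above does not say which limit was missed; this assumes it was the open-file limit (nofile), and the threshold ASSUMED_MIN_NOFILE and both function names are hypothetical values chosen for illustration, not anything from HBase itself. It only uses Python's standard resource module, so it runs on Unix-like systems.

```python
import resource

# Assumed illustrative threshold, not an official HBase number.
ASSUMED_MIN_NOFILE = 32768


def nofile_headroom(soft_limit, minimum=ASSUMED_MIN_NOFILE):
    """Return True if the soft open-file limit meets the assumed minimum."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # "unlimited" always passes
    return soft_limit >= minimum


def check_current_process(minimum=ASSUMED_MIN_NOFILE):
    """Read this process's open-file limits and evaluate the soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard, nofile_headroom(soft, minimum)


if __name__ == "__main__":
    soft, hard, ok = check_current_process()
    print(f"nofile soft={soft} hard={hard} ok={ok}")
```

Running it as the same user that starts the region servers is the point of the check, since limits are per user and per session.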
-
Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?Jean-Daniel Cryans 2010-10-04, 17:56
My vote is obviously +1, although we hit a bug this weekend regarding
HBASE-3008 (for which we'll post a patch soon). Over time, the memstore size of regions with ICVs grows negative, which means that those regions can't flush and when you close them you basically lose all the data since the last flush (since on close it won't flush either). We solved this by disabling ICVs to those tables (basically setting the async ICV queues in the thrift servers to -1), copied the data to another cluster, restarted the cluster with the fix, re-imported the data, then re-enabled the ICVs. I don't think this is a blocker for a DR, as it only affects users doing only tons of ICVs on particular tables, but it should be disclosed somewhere. J-D On Fri, Sep 24, 2010 at 4:03 PM, Jean-Daniel Cryans <[EMAIL PROTECTED]> wrote: > The 0.89.20100830 DR branch was cancelled, here's the new RC off a new branch. > > As discussed, this release candidate contains a revert of HBASE-2694 > which means that it is back on the "very" old master. It is also very > similar to what we run here in production. 
> > Sources and binaries can be found here: > > http://people.apache.org/~jdcryans/hbase-0.89.20100924-candidate-1/ > > Documentation: > > http://people.apache.org/~jdcryans/hbase-0.89.20100924-candidate-1/hbase-0.89.20100924/docs/ > > Here's the list of everything I added since moving from 0830: > > HBASE-3008 Memstore.updateColumnValue passes wrong flag to heapSizeChange > HBASE-3035 Bandaid for HBASE-2990 > HBASE-2643 Figure how to deal with eof splitting logs > HBASE-2941 port HADOOP-6713 - threading scalability for RPC reads - to HBase > HBASE-3006 Reading compressed HFile blocks causes way too many DFS RPC calls > severly impacting performance > HBASE-2989 [replication] RSM won't cleanup after locking if 0 peers > HBASE-2992 [replication] MalformedObjectNameException in ReplicationMetrics > HBASE-3034 Revert the regions assignment part of HBASE-2694 (and > pals) for 0.89 > HBASE-3033 [replication] ReplicationSink.replicateEntries improvements > HBASE-2997 Performance fixes - profiler driven > HBASE-2889 Tool to look at HLogs -- parse and tail -f (patch #2 only) > > Unfortunately I forgot to add HBASE-2986 like Stack asked (sorry, I > just figured it while reading the old voting thread). > > Should we release this as the next "Development Release"? Please cast > your vote by Wednesday, September 29th. > > Thanks, > > The HBase Team > +
-
RE: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Jonathan Gray 2010-10-04, 22:41
+1

I took it for a test drive today and tested all the basic stuff. No performance stuff, but I think enough for my vote.

JG
-
Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Ryan Rawson 2010-10-04, 23:56
I ran ycsb on it for a while and it looked ok... but we really can't ship without the fix to that bug, it has the possibility of causing serious data loss for heavy users of ICV.

-ryan
-
Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Stack 2010-10-05, 03:38
On Mon, Oct 4, 2010 at 4:56 PM, Ryan Rawson <[EMAIL PROTECTED]> wrote:
> I ran ycsb on it for a while and it looked ok... but we really can't
> ship without the fix to that bug, it has the possibility of causing
> serious data loss for heavy users of ICV.

We can ship the DR though, right? 0.90.0RC1 is just around the corner!

St.Ack
-
Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Ryan Rawson 2010-10-05, 03:40
we could, yes. with the caveat that no production use/data loss ahoy.

-ryan
-
Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Stack 2010-10-05, 03:51
Sure. That caveat about no warranty, do not use in "production", is on there already. And the bug is in ICVs only, right? We can release w/ warning that ICVers need to apply the patch, np.

Good stuff,
St.Ack
-
Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Ryan Rawson 2010-10-05, 03:52
yes it is ICV only, and most prevalent on tables that are heavily/only icv.

you can always kill -9 the RS to force log recovery and all will be well. assuming you can take the outage :-)
-
Re: [VOTE] Release 'development release' HBase 0.89.2010924 rc1?
Jean-Daniel Cryans 2010-10-05, 03:45
We already have 3 binding +1s, so the vote has passed.

J-D
-
[VOTE] Release 'development release' HBase 0.89.2010830 rc1?
Jean-Daniel Cryans 2010-08-30, 23:39
It's time for another DR since the new master code is about to be merged in and we have a few fixes and improvements that would benefit from more exposure. I branched from trunk this morning and created a new tag. (See http://wiki.apache.org/hadoop/Hbase/HBaseVersions for more on what these 0.89.x releases are all about.)

Binary and source tarballs are available here:

http://people.apache.org/~jdcryans/hbase-0.89.20100830-candidate-1/

You can also browse the candidate documentation here:

http://people.apache.org/~jdcryans/hbase-0.89.20100830-candidate-1/hbase-0.89.20100830/docs/

Issues resolved since 0.89.20100726, our second 0.89.x release, number roughly 23 and include fixed deadlocks, better handling of IOEs during splits, and improvements for filters: see http://su.pr/2HwiUe

Shall we release this candidate as the third in our 0.89.x series of developer releases?

Please see previous threads on 0.89 releases for more information about the purpose of this release candidate - in particular, this 'developer release' is for those who can tolerate risk and who are willing to give feedback in advance of our next major release. We're not making any guarantees that this is bug free. It's definitely not for production deploys.

We'll do another release like this in a few weeks after the new master code has gone in.

Please vote by Monday, September 6th.

Thanks,
J-D
-
Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc1?
Jean-Daniel Cryans 2010-08-31, 00:29
BTW I'm +1, I ran the unit tests, did cluster testing with YCSB, and deployed it in almost all our environments here (this will be completed before the voting period ends).

J-D
-
Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc1?
Stack 2010-09-03, 19:05
+1

It's running in production at SU.

St.Ack
-
Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc1?
Todd Lipcon 2010-09-07, 01:20
I did some load tests on this RC this afternoon and ran into this bug:

https://issues.apache.org/jira/browse/HBASE-2964

I got this after loading only 44GB (with 1G region size), so I don't think it's that rare. I'm also running with something like 30 handlers. My tests are with 80 concurrent clients.

So consider me a -0 - if no one else runs into this bug we may as well release, but it seems a step backward in stability for me from the 20100621 (the last one I did significant load testing on).

-Todd

--
Todd Lipcon
Software Engineer, Cloudera
Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc1?
Stack 2010-09-07, 04:19

Nice one Todd. It's a little like HBASE-2880. The issue you found looks like it has a few guises. I also think it's an issue we've always had, not a regression.

Things are different since the new master commit last week -- there are multiple executors per operation type (open, close, etc.), with root and meta having their own instantiations -- but you need a handler before you get an executor. The case where all handlers end up occupied (here, blocked on a particular region lock) and no progress can be made, because an edit needs to be made back into the same server before we can progress, looks like it can still happen, even in the new regime.

I agree we should just keep moving forward. Let's fix it in the next 0.89.x.

Good stuff,
St.Ack
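A minimal sketch of the handler exhaustion described above, assuming a fixed-size handler pool and a single region lock. The class, method, and parameter names are hypothetical and are not HBase APIs; a plain `ExecutorService` stands in for the region server's RPC handlers, and an empty task stands in for the edit that has to come back into the same server.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical model of handler exhaustion; none of these names are HBase APIs. */
final class HandlerStarvationSketch {

    /**
     * Simulates a split that holds the region lock while sending an edit
     * back to its own server. Returns true if the edit got a handler in time.
     */
    static boolean splitWhileHoldingLock(int handlers, int clients, long timeoutMs) {
        ExecutorService handlerPool = Executors.newFixedThreadPool(handlers);
        ReentrantLock regionLock = new ReentrantLock();
        CountDownLatch busy = new CountDownLatch(Math.min(handlers, clients));

        regionLock.lock(); // the split takes the region lock first
        try {
            // Client writes arrive; each one occupies a handler and blocks on the lock.
            for (int i = 0; i < clients; i++) {
                handlerPool.submit(() -> {
                    busy.countDown();
                    regionLock.lockInterruptibly();
                    regionLock.unlock();
                    return null;
                });
            }
            busy.await(); // as many handlers as possible are now occupied

            // The split needs a handler on the same server for its own edit.
            Future<?> metaEdit = handlerPool.submit(() -> { });
            metaEdit.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true; // a handler was still free
        } catch (TimeoutException e) {
            return false; // every handler is stuck behind the lock we hold
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            regionLock.unlock();
            handlerPool.shutdownNow(); // interrupt anything still blocked
        }
    }

    public static void main(String[] args) {
        System.out.println("3 clients, 4 handlers: " + splitWhileHoldingLock(4, 3, 2000));
        System.out.println("8 clients, 4 handlers: " + splitWhileHoldingLock(4, 8, 500));
    }
}
```

In this model the edit only goes through while the number of blocked clients stays below the number of handlers, which would be consistent with the problem showing up under 80 concurrent clients and roughly 30 handlers. The timeout exists only so the sketch reports the stall instead of hanging.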
Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc1?
Todd Lipcon 2010-09-07, 04:32

On Mon, Sep 6, 2010 at 9:19 PM, Stack <[EMAIL PROTECTED]> wrote:
> Nice one Todd. Its a little like HBASE-2880. The issue you found
> looks like it has a few guises. I also think it an issue we've always
> had, not a regression.

I don't think so. I killed the server that hit the issue, and within a couple of minutes it happened on another server. I've never seen this before, and this is the same sort of test that I've run on every other release.

Looking at the diff of HBASE-2461, I think the difference is that we used to close the region and then do the writes into META for the new info. Since the region was already marked closed by the time we were writing, we wouldn't have handlers stuck on the lock. Post-HBASE-2461, the META writes happen under the lock before the region gets marked closed.

> Things are different since new master commit last week -- there are
> multiple executors per operation type, open, close, etc., with root
> and meta having their own instantiations -- but you need a handler
> before you get an executor. The case where all handlers could end up
> occupied, in this case 'blocked' on a particular region lock, and no
> progress can be made because an edit needs to be made back into the
> same server before we can progress looks like it can still happen,
> even in the new regime.
>
> I agree we should just keep moving forward. Lets fix in the next 0.89.x.

Are other people not seeing this issue under a load test? I got it within an hour using YCSB with 1G split sizes. I imagine that if you tune it down to a stress-test config with lots of splits, you'll see it even faster.

-Todd

Todd Lipcon
Software Engineer, Cloudera
Re: [VOTE] Release 'development release' HBase 0.89.2010830 rc1?
Stack 2010-09-07, 18:02

On Mon, Sep 6, 2010 at 9:32 PM, Todd Lipcon <[EMAIL PROTECTED]> wrote:
> Looking at the diff of HBASE-2461, I think the difference is that we used to
> close the region and then do the writes into META for the new info. Since
> the region was already marked closed by the time we were writing, we
> wouldn't have handlers stuck on the lock. Post-HBASE-2461, the META writes
> happen under the lock before the region gets marked closed.

This would explain it.

New axiom: "Never hold a lock while trying to talk to another server."

Let's do a new RC (as suggested up on IRC).

St.Ack
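A rough sketch contrasting the two orderings discussed in this exchange, under the same simplifying assumptions as any toy model: one region lock, a fixed handler pool, and an empty task standing in for the META write. All names here (`SplitOrderingSketch`, `split`, `rpcUnderLock`) are hypothetical and do not correspond to the actual HBASE-2461 code. With `rpcUnderLock` set to false, the region is marked closed and the lock is released before the call, so waiting clients see the closed flag, bail out, and free their handlers.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical comparison of split orderings; not the actual HBase code. */
final class SplitOrderingSketch {

    /**
     * Runs one simulated split. Returns true if the stand-in META write
     * obtained a handler before the timeout.
     */
    static boolean split(boolean rpcUnderLock, int handlers, int clients, long timeoutMs) {
        ExecutorService handlerPool = Executors.newFixedThreadPool(handlers);
        ReentrantLock regionLock = new ReentrantLock();
        AtomicBoolean closed = new AtomicBoolean(false);
        CountDownLatch busy = new CountDownLatch(Math.min(handlers, clients));

        regionLock.lock();
        try {
            for (int i = 0; i < clients; i++) {
                handlerPool.submit(() -> {
                    busy.countDown();
                    regionLock.lockInterruptibly();
                    try {
                        // false models a write rejected because the region is closed
                        return !closed.get();
                    } finally {
                        regionLock.unlock();
                    }
                });
            }
            busy.await(); // client writes now occupy the handlers

            if (!rpcUnderLock) {
                // Mark the region closed and drop the lock before the call,
                // so blocked clients can fail fast and release their handlers.
                closed.set(true);
                regionLock.unlock();
            }
            Future<?> metaWrite = handlerPool.submit(() -> { });
            metaWrite.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false; // the write never got a handler
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            if (regionLock.isHeldByCurrentThread()) {
                regionLock.unlock();
            }
            handlerPool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("close, unlock, then write: " + split(false, 4, 8, 2000));
        System.out.println("write while holding lock:  " + split(true, 4, 8, 500));
    }
}
```

The only difference between the two runs is whether the lock is still held when the call is issued, which is the point of the axiom above.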