-
Region splitting problem
Ben Kim 2012-06-28, 07:07
Hi :)
I have an HBase table with row keys "1" ~ "10000000" and non-meaningful cells (each row has one cell of about 10 KB). I noticed that all the data was in one region, so I ran a command to split at row "1000000". HBase then automatically started splitting the regions, which should have happened earlier (this is fine with me).
The bigger problem is that for some regions the start key is larger than the end key. Below is the region info after all the auto-splits finished and HBase became steady. If you look at the 4th region and the second-to-last one, the start key is larger than the end key. How is this possible?
Name                                                                Region Server   Start Key  End Key   Requests
testtable,,1340843246470.03a98c01a47d448966e299112915c021.          hbase-20:60030  (empty)    10000000  0
testtable,10000000,1340847028444.d85e18bb7597d8bde91b2819b45a1020.  hbase-20:60030  10000000   15624273  0
testtable,15624273,1340847028444.2b3ec263d7810f883b3c0087f059ff15.  hbase-20:60030  15624273   21249208  0
testtable,21249208,1340846924325.e137e808cb7a4d3e52298f17cc5ae1b9.  hbase-20:60030  21249208   2687348   0
testtable,2687348,1340846924325.fc012ac943efd484e4bbe4bacc81c51a.   hbase-20:60030  2687348    3249908   0
testtable,3249908,1340846817015.0aeddf5bf1cdb073c9b2a0ace5bd709f.   hbase-20:60030  3249908    3812335   0
testtable,3812335,1340846817015.142b3619706dbb36d0338b2a587fd719.   hbase-20:60030  3812335    43748950  0
testtable,43748950,1340846705466.dd18534fabb5b06ec345424d4b174176.  hbase-20:60030  43748950   49373886  0
testtable,49373886,1340846705466.db6abcdeb0525c02b5def7fde147fd76.  hbase-20:60030  49373886   54999486  0
testtable,54999486,1340846592668.1907f5f6d601720d429f7cac824a82cb.  hbase-20:60030  54999486   60623755  0
testtable,60623755,1340846592668.1ee94c1f78e7e4dd661d5db4a55d49ed.  hbase-20:60030  60623755   66248691  0
testtable,66248691,1340846484950.6651e3964a3d55f39f36cd268af666a5.  hbase-20:60030  66248691   71873625  0
testtable,71873625,1340846484950.3a7e621d9c866a4e85816de36c18a6e2.  hbase-20:60030  71873625   77499226  0
testtable,77499226,1340846372518.1ce1f0cb47004bac1f3812e768be7749.  hbase-20:60030  77499226   83123496  0
testtable,83123496,1340846372518.0e76f8ff58b3c3a6c87c6e8116213fa6.  hbase-20:60030  83123496   88749097  0
testtable,88749097,1340846264242.fbfb1ecd4dc57d8a77c2457cd91c64c5.  hbase-20:60030  88749097   9437403   0
testtable,9437403,1340846264242.b92bf9b988416b1cb55bdf297654c2a5.   hbase-20:60030  9437403    (empty)   0
*Benjamin Kim* *benkimkimben at gmail*
-
RE: Region splitting problem
Ramkrishna.S.Vasudevan 2012-06-28, 08:33
Hi
What type of row keys are you specifying? HBase compares row keys byte by byte, not numerically.
So the splits that happened are correct.
"1012" and "10112" will fall in the same region whereas "201" will come in the next region.
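The ordering described above can be reproduced with a small sketch. It assumes the row keys are stored as the ASCII bytes of their decimal strings; `RowKeyOrder`, `compareBytes` and `compareKeys` are hypothetical names used only for illustration, and the hand-written unsigned byte comparison is a stand-in for what HBase does, not its actual code.

```java
import java.nio.charset.StandardCharsets;

final class RowKeyOrder {
    // Compare two byte arrays left to right as unsigned bytes. If one array
    // is a prefix of the other, the shorter one sorts first.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    // Compare two row keys given as strings (assumed to be stored as bytes).
    static int compareKeys(String a, String b) {
        return compareBytes(a.getBytes(StandardCharsets.UTF_8),
                            b.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Start and end key of the 4th region in the listing:
        // '1' < '6' at the second byte, so the start key sorts first.
        System.out.println(compareKeys("21249208", "2687348") < 0);
        // Start and end key of the second-to-last region: '8' < '9'.
        System.out.println(compareKeys("88749097", "9437403") < 0);
        // "1012" sorts before "201" even though 1012 > 201 numerically.
        System.out.println(compareKeys("1012", "201") < 0);
    }
}
```

Under this comparison the regions in the listing are in strictly increasing key order, which is why the start key only looks larger than the end key when the keys are read as numbers.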
It depends on how you form the row key.
Regards
Ram
-
Re: Region splitting problem
Ben Kim 2012-06-28, 11:44
You are so right. Why didn't I think of that :'(
I really appreciate your comment.
Ben
*Benjamin Kim* *benkimkimben at gmail*
-
Re: Region splitting problem
Jonathan Bishop 2012-06-28, 15:04
Ram,
Your key splitting is incorrect - I had the same problem. Give this a try... Notice that you need to insert a zero byte before the first byte to keep BigInteger from interpreting the value as a negative number (it uses the first bit as a sign bit), and that you need to strip off the leading zero when converting back to bytes, for a similar reason.
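The sign-bit issue can be seen with plain `java.math.BigInteger`. This is only a sketch: `UnsignedKeyDemo` and `toUnsigned` are hypothetical names, and the two-byte key is made up for the example.

```java
import java.math.BigInteger;

final class UnsignedKeyDemo {
    // Prepend a zero byte so the top bit of the key is not read as a sign bit.
    static BigInteger toUnsigned(byte[] key) {
        byte[] padded = new byte[key.length + 1]; // padded[0] stays 0
        System.arraycopy(key, 0, padded, 1, key.length);
        return new BigInteger(padded);
    }

    public static void main(String[] args) {
        byte[] key = {(byte) 0xff, 0x01};

        // Two's-complement interpretation: the high bit makes this negative.
        System.out.println(new BigInteger(key));

        // With the leading zero byte the same bytes read as 0xff01.
        System.out.println(toUnsigned(key));

        // The signum/magnitude constructor gives the same unsigned value.
        System.out.println(new BigInteger(1, key).equals(toUnsigned(key)));

        // Converting back yields an extra leading zero byte, which is why
        // it has to be stripped to recover a key of the original length.
        System.out.println(toUnsigned(key).toByteArray().length);
    }
}
```

The standard `BigInteger(int signum, byte[] magnitude)` constructor is an alternative to the manual zero byte on the way in; the leading zero on the way out still has to be handled either way.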
Jon
import java.io.IOException;
import java.math.BigInteger;
import org.apache.hadoop.hbase.util.Bytes;

public class HexSplits {  // wrapper class; the name is arbitrary

    // Read byteArray as an unsigned big-endian number. The extra leading
    // zero byte keeps BigInteger from treating the top bit as a sign bit.
    public static BigInteger getBigInteger(byte[] byteArray) {
        byte[] b = new byte[1 + byteArray.length];
        b[0] = 0;
        Bytes.putBytes(b, 1, byteArray, 0, byteArray.length);
        return new BigInteger(b);
    }

    // Compute numRegions - 1 evenly spaced split keys between startKey and endKey.
    public static byte[][] getHexSplits(byte[] startKey, byte[] endKey, int numRegions)
            throws IOException {
        if (startKey.length != endKey.length) {
            throw new IOException("start/end key lengths not equal");
        }
        int keyLength = startKey.length;
        byte[][] splits = new byte[numRegions - 1][];
        BigInteger lowestKey = getBigInteger(startKey);
        BigInteger highestKey = getBigInteger(endKey);
        BigInteger range = highestKey.subtract(lowestKey);
        BigInteger regionIncrement = range.divide(BigInteger.valueOf(numRegions));
        lowestKey = lowestKey.add(regionIncrement);
        for (int i = 0; i < numRegions - 1; i++) {
            BigInteger key = lowestKey.add(regionIncrement.multiply(BigInteger.valueOf(i)));
            byte[] s = key.toByteArray();
            if (s.length < keyLength) {
                throw new IOException("computed key length too small");
            }
            // Keep only the last keyLength bytes, dropping any leading zero byte.
            splits[i] = new byte[keyLength];
            Bytes.putBytes(splits[i], 0, s, s.length - keyLength, keyLength);
            if (i > 0 && Bytes.compareTo(splits[i - 1], splits[i]) >= 0) {
                throw new IOException("node hex splits are out of order");
            }
        }
        return splits;
    }
}
-
Re: Region splitting problem
Jonathan Bishop 2012-06-28, 15:07
Sorry, that was meant for Benjamin. Also, you can test your key order using Bytes.compareTo(a,b). I believe that is what is used internally (correct me if I am wrong).
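A rough sketch of such an order check follows. It uses a hand-written unsigned byte comparison as a stand-in so that it runs without HBase on the classpath; all names here (`KeyOrderCheck`, `plainKeys`, `paddedKeys`, `isStrictlyIncreasing`) are hypothetical. Fixed-width zero-padded keys are shown only as one common way to make byte order match numeric order; nobody in this thread proposed them.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class KeyOrderCheck {
    // Stand-in comparison: unsigned bytes, left to right, shorter prefix first.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // True if every key sorts strictly after the one before it.
    static boolean isStrictlyIncreasing(List<byte[]> keys) {
        for (int i = 1; i < keys.size(); i++) {
            if (compareBytes(keys.get(i - 1), keys.get(i)) >= 0) {
                return false;
            }
        }
        return true;
    }

    // Keys as plain decimal strings, e.g. 9 -> "9".
    static List<byte[]> plainKeys(long... values) {
        List<byte[]> keys = new ArrayList<>();
        for (long v : values) {
            keys.add(Long.toString(v).getBytes(StandardCharsets.UTF_8));
        }
        return keys;
    }

    // Keys as fixed-width, zero-padded decimal strings, e.g. 9 -> "00000009".
    static List<byte[]> paddedKeys(int width, long... values) {
        List<byte[]> keys = new ArrayList<>();
        for (long v : values) {
            String s = String.format(Locale.ROOT, "%0" + width + "d", v);
            keys.add(s.getBytes(StandardCharsets.UTF_8));
        }
        return keys;
    }

    public static void main(String[] args) {
        // "9" sorts after "10" byte-wise, so the plain keys are out of order.
        System.out.println(isStrictlyIncreasing(plainKeys(9, 10, 200)));
        // With equal-width keys, byte order and numeric order agree.
        System.out.println(isStrictlyIncreasing(paddedKeys(8, 9, 10, 200)));
    }
}
```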