-
a question about append operation of HFile.Writer
yonghu 2012-03-06, 17:01
Hello,
One HFile consists of many blocks. Suppose we have two blocks, b1 and b2, and the size of each block is 2K. In b1 we have two key-value pairs whose keys are t1 and t2, respectively. Each key-value pair is 1K, so b1 is full. Suppose that we now insert a new tuple whose key is also t1. HBase will call the HFile.Writer.append(kv) method to insert t1 into b2, because b1 is full. This is how I understand the HFile.Writer.append() operation. But when I write a test program as follows:
// block size is 2K
HFile.Writer hwriter = new HFile.Writer(fs,
    new Path("hdfs://localhost:8020/huyong/test"), 2,
    (Compression.Algorithm) null, null);

// key-value 1K
byte[] key2 = "Bim".getBytes();
byte[] value2 = new byte[1024];
for (int i = 0; i < 1024; i++) {
    value2[i] = 'b';
}
hwriter.append(key2, value2);

byte[] key3 = "Cim".getBytes();
byte[] value3 = new byte[1024];
for (int i = 0; i < 1024; i++) {
    value3[i] = 'c';
}
hwriter.append(key3, value3);

byte[] key4 = "Bim".getBytes();
byte[] value4 = new byte[1024];
for (int i = 0; i < 1024; i++) {
    value4[i] = 'b';
}
hwriter.append(key4, value4);
Then I get the following error:
Added a key not lexically larger than previous key=Bim, lastkey=Cim
So it seems that keys are in ascending order across all blocks: the keys in the first block are smaller than those in the second one. But if I want to insert a tuple whose key is the same as a key in the previous block, and the previous block has no room for that key-value pair, what will happen?
Thanks!
Yong
-
Re: a question about append operation of HFile.Writer
Mikael Sitruk 2012-03-06, 19:26
Hi,
I think that the flow of your test is not correct. The HFile is immutable; it is (from my understanding) the result of flushing the memstore, which is sorted lexicographically. The HFile is created at flush time. HFiles may have different sizes, but when compaction runs they are merged and sorted (merge sort). So your way of testing does not reflect the way HBase runs.
Mikael.s
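The flow described in this reply (buffer the edits, sort them, then append them to an immutable file in key order) can be sketched in plain Java. This is only an illustration and does not use the HBase API: SortedAppendSketch, Entry, SketchWriter and flush are hypothetical names, keys are plain strings rather than full HBase keys, and two behaviours are assumptions made for the sketch: the writer accepts a key equal to the previous one and rejects only smaller keys, and a block is closed once it has reached the target size.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortedAppendSketch {

    /** One key-value pair; a simplified stand-in for an HBase key-value. */
    static final class Entry {
        final String key;
        final byte[] value;

        Entry(String key, byte[] value) {
            this.key = key;
            this.value = value;
        }

        int size() {
            return key.length() + value.length;
        }
    }

    /** Hypothetical append-only writer that groups entries into blocks. */
    static final class SketchWriter {
        private final int blockSize;
        private final List<List<String>> blocks = new ArrayList<>();
        private List<String> current = new ArrayList<>();
        private int currentBytes = 0;
        private String lastKey = null;

        SketchWriter(int blockSize) {
            this.blockSize = blockSize;
        }

        void append(Entry e) {
            // Reject a key that sorts before the previously appended key.
            // Assumption: an equal key is accepted.
            if (lastKey != null && e.key.compareTo(lastKey) < 0) {
                throw new IllegalStateException(
                    "Added a key not lexically larger than previous key="
                        + e.key + ", lastkey=" + lastKey);
            }
            // Assumption: a block is closed once it has reached the target size,
            // so the next entry starts a new block.
            if (currentBytes >= blockSize) {
                blocks.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(e.key);
            currentBytes += e.size();
            lastKey = e.key;
        }

        /** Closes the last open block and returns the keys stored in each block. */
        List<List<String>> close() {
            if (!current.isEmpty()) {
                blocks.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            return blocks;
        }
    }

    /** Sorts the buffered edits by key (stable sort), then appends them in order. */
    static List<List<String>> flush(List<Entry> buffered, int blockSize) {
        List<Entry> sorted = new ArrayList<>(buffered);
        sorted.sort(Comparator.comparing((Entry e) -> e.key));
        SketchWriter writer = new SketchWriter(blockSize);
        for (Entry e : sorted) {
            writer.append(e);
        }
        return writer.close();
    }

    public static void main(String[] args) {
        byte[] oneKb = new byte[1024];
        List<Entry> buffered = new ArrayList<>();
        // Edits arrive out of key order, as in the test program above.
        for (String k : new String[] {"Bim", "Cim", "Bim", "Bim"}) {
            buffered.add(new Entry(k, oneKb));
        }
        System.out.println(flush(buffered, 2048)); // prints [[Bim, Bim], [Bim, Cim]]
    }
}
```

In this sketch the late-arriving "Bim" entries are never inserted into an already written block. They are sorted ahead of "Cim" before anything is written, and the third "Bim" simply becomes the first entry of the next block. Appending "Bim" directly after "Cim" to a SketchWriter reproduces the ordering error from the question.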