HBase, mail # user - a question about append operation of HFile.Writer

a question about append operation of HFile.Writer
yonghu 2012-03-06, 17:01

One HFile consists of many blocks. Suppose we have two blocks, b1 and
b2, each 2K in size. b1 holds two key-value pairs whose keys are t1
and t2, respectively. Each key-value pair is 1K, so b1 is full.
Suppose that we now insert a new tuple whose key is also t1. HBase
will call the HFile.Writer.append(kv) method to insert t1 into b2,
because b1 is full. This is how I understand the
HFile.Writer.append() operation. But when I write the following test
program:

// block size is 2K (the blocksize argument is given in bytes)
HFile.Writer hwriter = new HFile.Writer(fs,
    new Path("hdfs://localhost:8020/huyong/test"), 2 * 1024,
    (Compression.Algorithm) null, null);

// each key-value pair is about 1K
byte[] key2 = "Bim".getBytes();
byte[] value2 = new byte[1024];
for (int i = 0; i < 1024; i++) {
    value2[i] = 'b';
}
hwriter.append(key2, value2);

byte[] key3 = "Cim".getBytes();
byte[] value3 = new byte[1024];
for (int i = 0; i < 1024; i++) {
    value3[i] = 'c';
}
hwriter.append(key3, value3);

// same key as key2, appended after "Cim"
byte[] key4 = "Bim".getBytes();
byte[] value4 = new byte[1024];
for (int i = 0; i < 1024; i++) {
    value4[i] = 'b';
}
hwriter.append(key4, value4);

Then I get the following error:
  Added a key not lexically larger than previous key=Bim, lastkey=Cim
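
To make the failure easier to reason about, here is a minimal sketch of the kind of ordering check that would produce this message. It is a toy model, not HBase's actual code: the class OrderedAppendSketch and its methods are hypothetical, and it assumes that keys are compared as unsigned bytes and that a key equal to the previous one is tolerated.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

/** Toy stand-in for an append-only sorted writer. NOT HBase's HFile.Writer. */
public class OrderedAppendSketch {
    // Most recently appended key; null before the first append.
    private byte[] lastKey;

    /** Unsigned lexicographic comparison of two byte arrays. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    /**
     * Rejects a key that sorts before the previously appended key.
     * Assumption: an equal key is accepted. The value is ignored in this toy.
     */
    void append(byte[] key, byte[] value) throws IOException {
        if (lastKey != null && compareBytes(lastKey, key) > 0) {
            throw new IOException(
                "Added a key not lexically larger than previous key="
                + new String(key, StandardCharsets.UTF_8)
                + ", lastkey=" + new String(lastKey, StandardCharsets.UTF_8));
        }
        lastKey = key.clone();
    }

    /** Appends the keys in order; returns the error message, or null if all succeed. */
    static String tryAppendAll(String... keys) {
        OrderedAppendSketch writer = new OrderedAppendSketch();
        try {
            for (String k : keys) {
                writer.append(k.getBytes(StandardCharsets.UTF_8), new byte[1024]);
            }
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Same order as the test program: Bim, Cim, Bim.
        System.out.println(tryAppendAll("Bim", "Cim", "Bim"));
    }
}
```

In this model the third append fails only because "Bim" sorts before the last key "Cim"; whether the current block is full plays no part in the check.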

So it seems that the keys are in ascending order across all blocks:
the keys in the first block are smaller than those in the second.
But what happens if I want to insert a tuple whose key is the same as
one in the previous block, and the previous block has no room for
that key-value pair?