HBase >> mail # user >> a question about append operation of HFile.Writer
a question about append operation of HFile.Writer
Hello,

An HFile consists of many blocks. Suppose we have two blocks, b1 and
b2, each 2K in size. b1 holds two key-value pairs whose keys are t1
and t2, respectively. Each key-value pair is 1K, so b1 is full. Now
suppose we insert a new tuple whose key is also t1. As I understand
it, HBase will call HFile.Writer.append(kv) and put the new t1 into
b2, because b1 is full. But when I write the following test program:

// block size is 2K
HFile.Writer hwriter = new HFile.Writer(fs,
    new Path("hdfs://localhost:8020/huyong/test"), 2,
    (Compression.Algorithm) null, null);

// each key-value pair is 1K
byte[] key2 = "Bim".getBytes();
byte[] value2 = new byte[1024];
for (int i = 0; i < 1024; i++) {
    value2[i] = 'b';
}
hwriter.append(key2, value2);

byte[] key3 = "Cim".getBytes();
byte[] value3 = new byte[1024];
for (int i = 0; i < 1024; i++) {
    value3[i] = 'c';
}
hwriter.append(key3, value3);

// same key as the first pair, appended after "Cim"
byte[] key4 = "Bim".getBytes();
byte[] value4 = new byte[1024];
for (int i = 0; i < 1024; i++) {
    value4[i] = 'b';
}
hwriter.append(key4, value4);

Then I get the following error:
  Added a key not lexically larger than previous key=Bim, lastkey=Cim

So it seems that keys must be in ascending order across all blocks:
the keys in the first block are smaller than those in the second. But
what happens if I want to insert a tuple whose key is the same as one
in the previous block, and the previous block has no room for that
key-value pair?
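
To make the question concrete, here is a minimal standalone sketch of the behavior I am asking about. It is not HBase code: ToySortedBlockWriter and its methods are hypothetical names, and it rests on two assumptions of mine, namely that a key smaller than the last appended key is rejected while an equal key is accepted, and that a new block is started once the current one reaches the target size.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an append-only, sorted, block-structured file. NOT HBase code. */
public class ToySortedBlockWriter {
    private final int blockSize;                       // target block size in bytes
    private final List<List<String>> blocks = new ArrayList<>();
    private List<String> current = new ArrayList<>();  // keys in the open block
    private int currentBytes = 0;                      // bytes written to the open block
    private String lastKey = null;                     // last key appended, across all blocks

    public ToySortedBlockWriter(int blockSize) {
        this.blockSize = blockSize;
    }

    public void append(String key, byte[] value) {
        // Assumption: only a strictly smaller key is rejected; an equal key is allowed.
        if (lastKey != null && lastKey.compareTo(key) > 0) {
            throw new IllegalArgumentException(
                "Added a key not lexically larger than previous key=" + key
                + ", lastkey=" + lastKey);
        }
        // Assumption: a block is closed once it has reached the target size,
        // so the next pair goes into a fresh block regardless of its key.
        if (currentBytes >= blockSize) {
            finishBlock();
        }
        current.add(key);
        currentBytes += key.length() + value.length;
        lastKey = key;
    }

    private void finishBlock() {
        blocks.add(current);
        current = new ArrayList<>();
        currentBytes = 0;
    }

    /** Closes the open block, if any, and returns the keys grouped by block. */
    public List<List<String>> close() {
        if (!current.isEmpty()) {
            finishBlock();
        }
        return blocks;
    }

    public static void main(String[] args) {
        byte[] oneK = new byte[1024];
        ToySortedBlockWriter w = new ToySortedBlockWriter(2048);
        w.append("Bim", oneK);
        w.append("Bim", oneK);
        w.append("Bim", oneK);   // first block is full, so this lands in the second
        w.append("Cim", oneK);
        System.out.println(w.close());   // prints [[Bim, Bim], [Bim, Cim]]
    }
}
```

In this toy model the ordering check looks only at the last appended key, so a repeated key can spill into the next block as long as the pairs arrive in sorted order. Appending "Cim" and then "Bim" fails, just as in my test program. Whether the real HFile.Writer behaves the same way is what I would like to confirm.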

Thanks!

Yong