HDFS, mail # dev - [Important] Checksum Error while appending to the file.


[Important] Checksum Error while appending to the file.
Vinayakumar B 2012-02-29, 10:29
Hi All,

In one of our Hadoop clusters we hit a corrupted checksum (meta) file, which caused an append to the file to fail.

If any of you have faced this problem before, please share your experience.

We are using Hadoop 0.20.1 with the append feature.

Scenario:
==============
1. Created the file, wrote 305 bytes, and closed the stream.

2. Called append on the same file, wrote 307 bytes, and closed the stream.

3. Repeated step 2 with different sizes (311, 313, 307, 305, 313, 311, 307, 311, 313, 307, 307, 307, 305, 307, 305, 290, 288, 305, 307, 307, 307, 290 bytes).

4. Repeated step 2 once more with 294 bytes. This time the pipeline was {xxx.xxx.xxx.106:50010, xxx.xxx.xxx.xxx:10010}.
The file length is now 7629 bytes, and the stream was closed.

Here the checksum is verified by the last DataNode in the pipeline for every packet received.
If verification fails, an exception is thrown.

Since there is no exception in any of the DataNode logs, checksum verification must have succeeded, and the meta file size should be 67 bytes.

The meta file should contain 15 checksums of 4 bytes each (60 bytes) plus 7 header bytes. 7629/512 gives 14 checksums for full chunks and 1 checksum for the partial chunk.
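
The arithmetic in step 4 can be checked with a short, self-contained sketch. It assumes the values used in this message (512 bytes per checksum chunk, 4-byte checksums, a 7-byte header); the class and method names are hypothetical and are not part of Hadoop.

```java
public class ExpectedMetaSize {
    // Write sizes from steps 1-4 of the scenario, in order.
    static final int[] WRITES = {
        305, 307,
        311, 313, 307, 305, 313, 311, 307, 311, 313, 307, 307,
        307, 305, 307, 305, 290, 288, 305, 307, 307, 307, 290,
        294
    };

    // Total block length after all writes.
    static long totalLength(int[] writes) {
        long total = 0;
        for (int w : writes) {
            total += w;
        }
        return total;
    }

    // One checksum per full chunk, plus one for a trailing partial chunk.
    static long checksumCount(long blockLength, int bytesPerChecksum) {
        return (blockLength + bytesPerChecksum - 1) / bytesPerChecksum;
    }

    // Header bytes plus one fixed-size checksum per chunk.
    static long expectedMetaLength(long blockLength, int bytesPerChecksum,
                                   int checksumSize, int headerSize) {
        return headerSize + checksumCount(blockLength, bytesPerChecksum) * checksumSize;
    }

    public static void main(String[] args) {
        long length = totalLength(WRITES);
        System.out.println("block length   = " + length);
        System.out.println("checksums      = " + checksumCount(length, 512));
        System.out.println("meta file size = " + expectedMetaLength(length, 512, 4, 7));
    }
}
```

Under these assumptions the sketch reports a block length of 7629 bytes, 15 checksums, and a 67-byte meta file, matching the figures above.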

5. Append was called on the same file again. This time the append failed because block recovery failed at the DataNodes with the exception below.

java.io.IOException: Block blk_1329468764084_188363 is of size 7629 but has 17 checksums and each checksum size is 4 bytes.
 at org.apache.hadoop.hdfs.server.datanode.FSDataset.validateBlockMetadata(FSDataset.java:1922)
 at org.apache.hadoop.hdfs.server.datanode.FSDataset.startBlockRecovery(FSDataset.java:2142)
 at org.apache.hadoop.hdfs.server.datanode.DataNode.startBlockRecovery(DataNode.java:2078)
 at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1139)
 at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1135)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1133)
Here the meta file length is 17*4+7=75 bytes, but it should be 67 bytes according to the calculation in step 4.
The data block sizes in step 4 and step 5 match, but the meta file sizes do not.
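
To see how far apart the two figures are, a meta file length can be inverted into the range of block lengths it would cover. This is only a sketch of the consistency rule that the exception message implies, not the actual code of FSDataset.validateBlockMetadata; the class and method names are hypothetical, and the 512/4/7 values are the ones assumed in this message.

```java
public class MetaLengthCheck {
    // Number of checksums stored in a meta file of the given length.
    static long checksumsInMeta(long metaLength, int headerSize, int checksumSize) {
        return (metaLength - headerSize) / checksumSize;
    }

    // Smallest block length that needs exactly this many checksums.
    static long minBlockLength(long checksums, int bytesPerChecksum) {
        return checksums == 0 ? 0 : (checksums - 1) * bytesPerChecksum + 1;
    }

    // Largest block length that needs exactly this many checksums.
    static long maxBlockLength(long checksums, int bytesPerChecksum) {
        return checksums * bytesPerChecksum;
    }

    // True if the block length falls in the range covered by the meta file.
    static boolean isConsistent(long blockLength, long metaLength,
                                int bytesPerChecksum, int checksumSize, int headerSize) {
        long n = checksumsInMeta(metaLength, headerSize, checksumSize);
        return blockLength >= minBlockLength(n, bytesPerChecksum)
            && blockLength <= maxBlockLength(n, bytesPerChecksum);
    }

    public static void main(String[] args) {
        long n = checksumsInMeta(75, 7, 4);
        System.out.println("checksums in 75-byte meta file: " + n);
        System.out.println("block length range: " + minBlockLength(n, 512)
            + ".." + maxBlockLength(n, 512));
        System.out.println("7629 vs 75-byte meta consistent? "
            + isConsistent(7629, 75, 512, 4, 7));
        System.out.println("7629 vs 67-byte meta consistent? "
            + isConsistent(7629, 67, 512, 4, 7));
    }
}
```

With these assumptions, 17 checksums would correspond to a block of 8193 to 8704 bytes, so the meta file holds two more checksums (8 bytes) than a 7629-byte block needs.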

Thanks and Regards,
Vinayakumar B