In our production environment, we have encountered a performance problem with the NameNode.
We configured the shared storage of the NameNode with BookKeeper. Our Hadoop version is 2.0.1 and BookKeeper is 4.1.0.
The problem is: after the HDFS system has been running for a while (2-3 days), performance decreases dramatically.
A benchmark with nnbench from hadoop-mapreduce-client-jobclient-2.0.1-tests.jar gives:
./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.0.1-tests.jar nnbench -operation create_write -numberOfFiles 10
12/10/20 20:05:43 INFO hdfs.NNBench: TPS: Create/Write/Close: 52
Two days later, we get:
12/10/23 18:34:42 INFO hdfs.NNBench: TPS: Create/Write/Close: 1
Note: the "Avg exec time (ms): Create/Write/Close:" value is even larger, possibly more than 1000 ms, so the real TPS may be lower than the reported 1 because of rounding.
In the NameNode logs, we found this difference between the two runs:
2012-10-20 20:05:43,249 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: **** Number of syncs: 1347 SyncTimes(ms): 14138 3677
2012-10-22 18:34:42,223 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLog: **** Number of syncs: 51 SyncTimes(ms): 34553 312
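To make the difference easier to compare, the average time per sync can be derived from these two lines. The following is a minimal Python sketch, not part of Hadoop. It assumes that each number after "SyncTimes(ms):" is the total sync time of one configured journal over the same window as the sync count; which value belongs to the BookKeeper journal is not stated in the logs. The function name avg_sync_ms is made up for illustration.

```python
import re

# Matches the FSEditLog summary format shown in the two log lines above.
SUMMARY_RE = re.compile(r"Number of syncs: (\d+) SyncTimes\(ms\): ([\d ]+)")


def avg_sync_ms(line):
    """Return the average milliseconds per sync for each SyncTimes value.

    Assumption: every SyncTimes value is a total over all counted syncs.
    """
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not an FSEditLog sync summary line")
    syncs = int(match.group(1))
    totals = [int(value) for value in match.group(2).split()]
    return [total / syncs for total in totals]


first_run = ("2012-10-20 20:05:43,249 INFO org.apache.hadoop.hdfs.server."
             "namenode.FSEditLog: **** Number of syncs: 1347 "
             "SyncTimes(ms): 14138 3677")
later_run = ("2012-10-22 18:34:42,223 INFO org.apache.hadoop.hdfs.server."
             "namenode.FSEditLog: **** Number of syncs: 51 "
             "SyncTimes(ms): 34553 312")

for label, line in (("first run", first_run), ("later run", later_run)):
    print(label, [round(avg, 1) for avg in avg_sync_ms(line)])
```

Under that assumption, the first value goes from about 10.5 ms per sync to about 677.5 ms per sync, while the second value goes from about 2.7 ms to about 6.1 ms.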
We suspect that the problem is in BookKeeper. Has anyone encountered this before, or does anyone have a clue about it? Thanks very much.
The environment is strictly controlled and the logs can only be copied by hand, so the logs are not very detailed.
Ivan Kelly 2012-10-24, 15:10