

Re: High IO Usage in Datanodes due to Replication
The block scanner is a simple, independent operation of the DN that
runs periodically and does its work in small phases, to ensure that no
blocks exist that fail to match their checksums (it's an automatic
data validator). This lets it report corrupt/rotting blocks and
keep the cluster healthy.
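
To illustrate the idea of checksum validation, here is a minimal Python sketch. It is not the actual DataBlockScanner code: the function names are hypothetical, and the 512-byte chunk size and plain CRC32 are assumptions chosen for the example. It computes a checksum per fixed-size chunk and reports which chunks no longer match the stored values.

```python
import zlib
from typing import List

CHUNK_SIZE = 512  # assumed bytes covered by each checksum (illustrative)


def chunk_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Return one CRC32 value per fixed-size chunk of the block data."""
    return [
        zlib.crc32(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def find_corrupt_chunks(data: bytes, expected: List[int],
                        chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Return the indices of chunks whose checksum differs from the stored one."""
    actual = chunk_checksums(data, chunk_size)
    if len(actual) != len(expected):
        raise ValueError("block length does not match the stored checksums")
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

A scanner built this way would read each block, call something like `find_corrupt_chunks` against the checksums stored at write time, and report the block as corrupt if the returned list is non-empty.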

Its runtime shouldn't cause any issues, unless your DN holds a lot of
blocks (more than normal, due to an overload of small, inefficient files)
but has too little heap to handle both retention and block scanning.

> 1. Will the data node refuse writes during the DataBlockScanning process?

No such thing. As I said, it's independent and mostly lock-free. Writes
and reads are not hampered.

> 2. Will the data node return to normal only when "Not yet verified" reaches zero in the data node blockScannerReport?

Yes, but note that this runs over and over again (once every 3 weeks IIRC).
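
As a rough way to gauge how long the current pass still has to run, the sketch below parses the plain-text output of blockScannerReport and divides "Not yet verified" by "Verified in last day". The function names are hypothetical, the field labels are taken from the report quoted in this thread, and the estimate assumes the last day's verification rate stays constant, which may not hold because the scanner is rate limited.

```python
import re
from typing import Dict

# Matches lines such as "Not yet verified             : 447943".
# Percentage fields and URLs do not match and are skipped.
_FIELD = re.compile(r"^\s*(.+?)\s*:\s*(\d+)\s*$")


def parse_scanner_report(text: str) -> Dict[str, int]:
    """Collect the integer-valued fields of a blockScannerReport page."""
    stats: Dict[str, int] = {}
    for line in text.splitlines():
        match = _FIELD.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def estimate_days_remaining(stats: Dict[str, int]) -> float:
    """Estimate days until "Not yet verified" reaches zero at the last day's rate."""
    remaining = stats["Not yet verified"]
    per_day = stats["Verified in last day"]
    if per_day == 0:
        raise ValueError("no blocks verified in the last day; rate is unknown")
    return remaining / per_day
```

For the report quoted in this thread (447943 blocks not yet verified, 107355 verified in the last day), this gives roughly 4.2 days.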

On Wed, May 1, 2013 at 11:33 AM, selva <[EMAIL PROTECTED]> wrote:
> Thanks Harsh & Manoj for the inputs.
>
> I have now found that the data node is busy with block scanning. I have TBs
> of data attached to each data node, so it's taking days to complete the
> block scanning. I have two questions.
>
> 1. Will the data node refuse writes during the DataBlockScanning
> process?
>
> 2. Will the data node return to normal only when "Not yet verified" reaches
> zero in the data node blockScannerReport?
>
> # Data node logs
>
> 2013-05-01 05:53:50,639 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_-7605405041820244736_20626608
> 2013-05-01 05:53:50,664 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_-1425088964531225881_20391711
> 2013-05-01 05:53:50,692 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_2259194263704433881_10277076
> 2013-05-01 05:53:50,740 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_2653195657740262633_18315696
> 2013-05-01 05:53:50,818 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_-5124560783595402637_20821252
> 2013-05-01 05:53:50,866 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_6596021414426970798_19649117
> 2013-05-01 05:53:50,931 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_7026400040099637841_20741138
> 2013-05-01 05:53:50,992 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_8535358360851622516_20694185
> 2013-05-01 05:53:51,057 INFO
> org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification
> succeeded for blk_7959856580255809601_20559830
>
> # Block scanner report from one of my data nodes
>
> http://<datanode-host>:15075/blockScannerReport
>
> Total Blocks                 : 2037907
> Verified in last hour        :   4819
> Verified in last day         : 107355
> Verified in last week        : 686873
> Verified in last four weeks  : 1589964
> Verified in SCAN_PERIOD      : 1474221
> Not yet verified             : 447943
> Verified since restart       : 318433
> Scans since restart          : 318058
> Scan errors since restart    :      0
> Transient scan errors        :      0
> Current scan rate limit KBps :   3205
> Progress this period         :    101%
> Time left in cur period      :  86.02%
>
> Thanks
> Selva
>
>
> -----Original Message-----
> From "S, Manoj" <[EMAIL PROTECTED]>
> Subject RE: High IO Usage in Datanodes due to Replication
> Date Mon, 29 Apr 2013 06:41:31 GMT
> Adding to Harsh's comments:
>
> You can also tweak a few OS level parameters to improve the I/O performance.
> 1) Mount the filesystem with "noatime" option.
> 2) Check whether changing the I/O scheduling algorithm improves the
> cluster's performance.
> (Check this file /sys/block/<device_name>/queue/scheduler)
> 3) If there are lots of I/O requests and your cluster hangs because of that,

Harsh J