HDFS >> mail # dev >> DataBlockScanner scan period


Re: DataBlockScanner scan period
Hi Thanh,

That is correct.  Last time I read the code, Hadoop scheduled the block verifications randomly throughout the period in order to avoid periodic effects (i.e., high load every N minutes).
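
A minimal sketch of that idea, for illustration only: it is not the actual DataBlockScanner code. The class and method names (`ScanScheduleSketch`, `randomSchedule`, `averageIntervalMillis`) are hypothetical. Each block gets a random verification time somewhere inside the scan period, so the work is spread out rather than bunched up, and the average gap between verifications is simply the period divided by the number of blocks.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative sketch only; names here are hypothetical, not Hadoop's API.
public class ScanScheduleSketch {
    // Same default as quoted in the thread: three weeks, in hours.
    static final long DEFAULT_SCAN_PERIOD_HOURS = 21 * 24L;

    // Average time between consecutive block verifications on one node.
    static long averageIntervalMillis(long periodMillis, int numBlocks) {
        return periodMillis / numBlocks;
    }

    // Give each block a random offset in [0, periodMillis) and return the
    // offsets in ascending order, i.e. the order the scans would happen.
    static List<Long> randomSchedule(int numBlocks, long periodMillis, Random rng) {
        List<Long> offsets = new ArrayList<>();
        for (int i = 0; i < numBlocks; i++) {
            offsets.add(Math.floorMod(rng.nextLong(), periodMillis));
        }
        Collections.sort(offsets);
        return offsets;
    }

    public static void main(String[] args) {
        long periodMillis = DEFAULT_SCAN_PERIOD_HOURS * 60L * 60L * 1000L;
        int numBlocks = 100; // the example node from the thread
        List<Long> schedule = randomSchedule(numBlocks, periodMillis, new Random());
        System.out.println("first verification at +" + schedule.get(0) + " ms");
        System.out.println("average interval: "
                + averageIntervalMillis(periodMillis, numBlocks) / 3_600_000.0 + " hours");
    }
}
```

With the three-week default and 100 blocks, the average works out to 504 hours / 100, or about 5 hours between block verifications, though any individual gap varies because the times are random.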

Brian

On Oct 13, 2010, at 7:14 PM, Thanh Do wrote:

> Brian,
>
> When you say it *attempts* to complete an *entire* node scan,
> do you mean that, for example, if a node has 100 block files, it will
> try to verify all 100 blocks every 3 weeks?
> That is, on average, one block is scanned every (3 weeks / 100)?
>
> Thanks
> Thanh
>
>
> On Wed, Oct 13, 2010 at 7:07 PM, Brian Bockelman <[EMAIL PROTECTED]> wrote:
>
>> Hi Thanh,
>>
>> The scan period is the period within which Hadoop *attempts* to complete an
>> entire node scan.  That is, if it's set to 3 weeks, HDFS will try to scan each
>> block once every 3 weeks.
>>
>> Obviously, depending on the bandwidth you have made available to the
>> scanning thread, you can specify impossibly small periods.
>>
>> Brian
>>
>> On Oct 13, 2010, at 7:01 PM, Thanh Do wrote:
>>
>>> Hi again,
>>>
>>> Could anybody explain the scanning period
>>> policy of DataBlockScanner? That is, how often does it wake up
>>> and scan a block file?
>>> When looking at the code, I found
>>>
>>> static final long DEFAULT_SCAN_PERIOD_HOURS = 21*24L; // three weeks
>>>
>>>
>>> but surely it does not wake up and pick a random block
>>> to verify only once every three weeks, right?
>>>
>>> Thanks a lot,
>>> Thanh
>>
>>