What parameter can I use to make it check more often, say every 3 days?
On Mon, Jun 25, 2012 at 7:33 AM, Kai Voigt <[EMAIL PROTECTED]> wrote:
> HDFS has block checksums. Whenever a block is written to the datanodes, a
> checksum is calculated and written with the block to the datanodes' disks.
> Whenever a block is requested, the block's checksum is verified against
> the stored checksum. If they don't match, that block is corrupt. But since
> there are additional replicas of the block, chances are high that another
> replica matches the checksum. Corrupt blocks will be scheduled to be rereplicated.
> Also, to prevent bit rot, blocks are checked periodically (weekly by
> default, I believe; you can configure that period) in the background.
> On 25.06.2012 at 13:29, Rita wrote:
> > Does Hadoop, HDFS in particular, do any sanity checks of the file before
> > and after balancing/copying/reading the files? We have 20TB of data and I
> > want to make sure that after these operations are completed the data is
> > still in good shape. Where can I read about this?
> > tia
> > --
> > --- Get your facts first, then you can distort them as you please.--
> Kai Voigt
> [EMAIL PROTECTED]
--- Get your facts first, then you can distort them as you please.--
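
For reference, the period Kai mentions appears to be controlled by the DataNode block scanner setting dfs.datanode.scan.period.hours in hdfs-site.xml. The fragment below is a sketch from memory, not checked against this cluster's Hadoop version, so please confirm the property name and its default in your release's hdfs-default.xml. To scan roughly every 3 days, the value would be 72 (hours):

```xml
<!-- hdfs-site.xml on each DataNode.
     Property name is an assumption to verify against your release's hdfs-default.xml. -->
<property>
  <name>dfs.datanode.scan.period.hours</name>
  <!-- 3 days * 24 hours -->
  <value>72</value>
</property>
```

As far as I know the built-in default is 504 hours (three weeks) rather than weekly, and the DataNodes need a restart to pick up the change. A shorter period means more background disk reads, which is worth keeping in mind with 20TB of data.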