Hadoop, mail # user - file checksum


Rita 2012-06-25, 11:29
Kai Voigt 2012-06-25, 11:33
Re: file checksum
Rita 2012-06-26, 00:46
What parameter can I use to make the background check run more often, say every 3 days?

On Mon, Jun 25, 2012 at 7:33 AM, Kai Voigt <[EMAIL PROTECTED]> wrote:

> HDFS has block checksums. Whenever a block is written to the datanodes, a
> checksum is calculated and written with the block to the datanodes' disks.
>
> Whenever a block is requested, its checksum is recomputed and verified
> against the stored checksum. If they don't match, that replica is corrupt.
> But since there are additional replicas of the block, chances are high that
> another replica matches its checksum. Corrupt blocks will be scheduled to
> be re-replicated.
>
> Also, to guard against bit rot, blocks are checked periodically (weekly by
> default, I believe; you can configure that period) in the background.
>
> Kai
>
> On 25.06.2012 at 13:29, Rita wrote:
>
> > Does Hadoop, HDFS in particular, do any sanity checks of the file before
> > and after balancing/copying/reading the files? We have 20TB of data and I
> > want to make sure that after these operations are completed the data is
> > still in good shape. Where can I read about this?
> >
> > tia
> >
> > --
> > --- Get your facts first, then you can distort them as you please.--
>
> --
> Kai Voigt
> [EMAIL PROTECTED]
>
>
>
>
>
--
--- Get your facts first, then you can distort them as you please.--
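
The write-time checksum and verify-on-read behaviour Kai describes can be illustrated with a small standalone sketch. This is not HDFS code: the class name `ChunkChecksums`, both method names, and the 512-byte chunk size used in `main` are assumptions made for illustration, and plain `java.util.zip.CRC32` stands in for whatever checksum a given Hadoop version actually uses. It only shows the idea of storing one checksum per fixed-size chunk and recomputing it later to detect corruption.

```java
import java.util.zip.CRC32;

// Illustrative only: mimics "checksum on write, verify on read" for a byte array.
class ChunkChecksums {

    // Compute one CRC32 value per chunk of bytesPerChecksum bytes
    // (the last chunk may be shorter).
    static long[] compute(byte[] data, int bytesPerChecksum) {
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        long[] sums = new long[chunks];
        for (int i = 0; i < chunks; i++) {
            int off = i * bytesPerChecksum;
            int len = Math.min(bytesPerChecksum, data.length - off);
            CRC32 crc = new CRC32();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Recompute the checksums and compare them with the stored ones.
    // Returns the index of the first mismatching chunk, or -1 if all match.
    static int firstCorruptChunk(byte[] data, long[] stored, int bytesPerChecksum) {
        long[] actual = compute(data, bytesPerChecksum);
        int n = Math.max(actual.length, stored.length);
        for (int i = 0; i < n; i++) {
            if (i >= actual.length || i >= stored.length || actual[i] != stored[i]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] block = new byte[1300];
        for (int i = 0; i < block.length; i++) {
            block[i] = (byte) i;
        }

        // "Write": store the checksums next to the data.
        long[] stored = compute(block, 512);
        System.out.println("clean read, first corrupt chunk: "
                + firstCorruptChunk(block, stored, 512));

        // Simulate bit rot: flip a single bit inside the second chunk.
        block[600] ^= 0x01;
        System.out.println("after bit flip, first corrupt chunk: "
                + firstCorruptChunk(block, stored, 512));
    }
}
```

In a real cluster a mismatch like this would cause the reader to fall back to another replica and the bad replica to be reported, as described in the quoted message; the sketch stops at detection.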