how to handle the corrupt block in HDFS?


ch huang 2013-12-10, 00:32
ch huang 2013-12-10, 01:15
ch huang 2013-12-10, 01:20
ch huang 2013-12-11, 01:18
Vinayakumar B 2013-12-11, 01:21
Adam Kawa 2013-12-11, 18:33

ch huang 2013-12-12, 00:44
Re: how to handle the corrupt block in HDFS?
And does fsck report data from the BlockPoolSliceScanner? It seems to run
once every 3 weeks.
Can I restart the DNs one by one without interrupting the job that is running?
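
A minimal sketch of such a rolling restart, assuming a 2.x-era install where hadoop-daemon.sh is on the PATH of each DataNode host and the cluster's replication factor is at least 2 (the hdfs user and the script location vary by distribution):

$ sudo -u hdfs hadoop-daemon.sh stop datanode     # on one DataNode host at a time
$ sudo -u hdfs hadoop-daemon.sh start datanode
$ sudo -u hdfs hdfs dfsadmin -report              # wait until the node shows as live again
$ sudo -u hdfs hdfs fsck /                        # re-check before touching the next node

With at least one healthy replica still available, tasks that were reading from the restarted node should fail over to another replica, so a one-node-at-a-time restart should not kill a running job.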

On Thu, Dec 12, 2013 at 2:33 AM, Adam Kawa <[EMAIL PROTECTED]> wrote:

>  I have only a 1-node cluster, so I am not able to verify this when the
> replication factor is bigger than 1.
>
>  I ran fsck on a file that consists of 3 blocks, where 1 block has a
> corrupt replica. fsck reported that the system is HEALTHY.
>
> When I restarted the DN, the block scanner (BlockPoolSliceScanner)
> started and detected the corrupted replica. Then I ran fsck again on that
> file, and it told me that the system is CORRUPT.
>
> If you have a small (and non-production) cluster, can you restart your
> datanodes and run fsck again?
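
A sketch of that recheck for a small, non-production 2.x-era cluster; the file path and DataNode host are placeholders, and 50075 is the default DataNode web port (the blockScannerReport servlet may not be exposed on every build):

$ sudo -u hdfs hdfs fsck /path/to/file -files -blocks -locations          # note blocks and replica locations
$ sudo -u hdfs hadoop-daemon.sh stop datanode                             # on the node holding the suspect replica
$ sudo -u hdfs hadoop-daemon.sh start datanode
$ curl -s 'http://<datanode-host>:50075/blockScannerReport?listblocks'    # watch the scanner work through its blocks
$ sudo -u hdfs hdfs fsck /path/to/file -files -blocks -locations          # re-run once the replica has been scanned

On a 1-node cluster the single replica is "all replicas", which is why fsck flipped to CORRUPT; with healthy replicas elsewhere, fsck stays HEALTHY and the NameNode should re-replicate the block and drop the bad copy.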
>
>
>
> 2013/12/11 ch huang <[EMAIL PROTECTED]>
>
>> Thanks for the reply, but if the block has just 1 corrupt replica, hdfs fsck
>> cannot tell you which block of which file has a corrupted replica; fsck is
>> only useful when all of a block's replicas are bad.
>>
>> On Wed, Dec 11, 2013 at 10:01 AM, Adam Kawa <[EMAIL PROTECTED]> wrote:
>>
>>> When you identify a file with corrupt block(s), you can locate the
>>> machines that store its blocks by typing
>>> $ sudo -u hdfs hdfs fsck <path-to-file> -files -blocks -locations
>>>
>>>
>>> 2013/12/11 Adam Kawa <[EMAIL PROTECTED]>
>>>
>>>> Maybe this can work for you
>>>> $ sudo -u hdfs hdfs fsck / -list-corruptfileblocks
>>>> ?
>>>>
>>>>
>>>> 2013/12/11 ch huang <[EMAIL PROTECTED]>
>>>>
>>>>> Thanks for the reply. What I do not know is how to locate the block
>>>>> that has the corrupt replica, so I can observe how long it takes for the
>>>>> corrupt replica to be removed and replaced by a new healthy one. I have
>>>>> been getting Nagios alerts for three days; I am not sure whether the same
>>>>> corrupt replica is causing the alerts, and I do not know the interval at
>>>>> which HDFS checks for corrupt replicas and cleans them up.
>>>>>
>>>>>
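
For the monitoring part of that question, a rough sketch; the file path is a placeholder and the one-hour interval is arbitrary:

$ while true; do date; sudo -u hdfs hdfs fsck /path/to/suspect/file -files -blocks -locations; sleep 3600; done

As for the interval: the DataNode block scanner is governed by dfs.datanode.scan.period.hours, whose default in this era has historically been 504 hours (3 weeks, matching the figure mentioned above) — treat the exact property name and default as something to verify for your version.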
>>>>> On Tue, Dec 10, 2013 at 6:20 PM, Vinayakumar B <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>>  Hi ch huang,
>>>>>>
>>>>>>
>>>>>>
>>>>>> It may seem strange, but the fact is:
>>>>>>
>>>>>> *CorruptBlocks* through JMX means *"Number of blocks with corrupt
>>>>>> replicas"*. Not all replicas are necessarily corrupt; you can check
>>>>>> the description through jconsole.
>>>>>>
>>>>>> Whereas *Corrupt blocks* through fsck means *blocks with all
>>>>>> replicas corrupt (non-recoverable) or missing*.
>>>>>>
>>>>>> In your case, maybe only one replica of the block is corrupt, not all
>>>>>> replicas of the same block. The corrupt replica will be deleted
>>>>>> automatically once another datanode is available in your cluster and
>>>>>> the block has been re-replicated to it.
>>>>>>
>>>>>> Regarding replication 10: as Peter Marron said, *some of the
>>>>>> important files of the MapReduce job are written with a replication
>>>>>> factor of 10, to make them accessible faster and launch map tasks faster.*
>>>>>>
>>>>>> Anyway, if the job succeeds these files are deleted automatically. I
>>>>>> think only in some cases, when a job is killed partway through, do
>>>>>> these files remain in HDFS showing under-replicated blocks.
>>>>>>
>>>>>> Thanks and Regards,
>>>>>>
>>>>>> Vinayakumar B
>>>>>>
>>>>>>
>>>>>>
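
One way to see both numbers Vinayakumar describes side by side, assuming a 2.x NameNode with its web UI on the default port 50070; the host is a placeholder and the metric names come from the FSNamesystem JMX bean:

$ curl -s 'http://<namenode-host>:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' \
    | grep -E '"(CorruptBlocks|MissingBlocks|UnderReplicatedBlocks)"'
$ sudo -u hdfs hdfs fsck / | grep -iE 'corrupt blocks|missing replicas|under-replicated'

If the JMX CorruptBlocks count is non-zero while fsck still reports HEALTHY, you are in the single-bad-replica case described above, and the Nagios alert should clear once the NameNode re-replicates the block and removes the bad copy.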
>>>>>> *From:* Peter Marron [mailto:[EMAIL PROTECTED]]
>>>>>> *Sent:* 10 December 2013 14:19
>>>>>> *To:* [EMAIL PROTECTED]
>>>>>> *Subject:* RE: how to handle the corrupt block in HDFS?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>>
>>>>>> I am sure that there are others who will answer this better, but anyway.
>>>>>> The default replication level for files in HDFS is 3, and so most files
>>>>>> that you see will have a replication level of 3. However, when you run a
>>>>>> Map/Reduce job the system knows in advance that every node will need a
>>>>>> copy of certain files. Specifically the job.xml and the various jars
>
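
On the replication-10 point discussed above: the factor used for those job submission files is a client-side setting, and stale staging files left behind by killed jobs can be re-replicated down by hand. A sketch, assuming MRv2 naming (mapreduce.client.submit.file.replication, default 10 in mapred-default.xml; the MRv1 equivalent was mapred.submit.replication), a job driver that goes through ToolRunner so -D options are honoured, and placeholder jar/class names and paths taken from fsck output:

$ hadoop jar my-job.jar MyJobDriver -D mapreduce.client.submit.file.replication=3 <args>   # lower it per submission
$ sudo -u hdfs hdfs dfs -setrep -w 3 /path/reported/as/under-replicated                    # shrink leftovers from killed jobs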