Re: how to handle the corrupt block in HDFS?
Thanks for the reply, but if a block has just one corrupt replica, hdfs fsck
cannot tell you which block of which file has the corrupted replica; fsck
only helps when all of a block's replicas are bad.

On Wed, Dec 11, 2013 at 10:01 AM, Adam Kawa <[EMAIL PROTECTED]> wrote:

> When you identify a file with corrupt block(s), you can locate the
> machines that store its blocks by typing
> $ sudo -u hdfs hdfs fsck <path-to-file> -files -blocks -locations
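>
> To chain the two steps (a sketch; the output format of
> -list-corruptfileblocks varies between Hadoop versions, so the awk
> filter below may need adjusting):
>
> $ for f in $(sudo -u hdfs hdfs fsck / -list-corruptfileblocks | awk '/blk_/ {print $2}'); do
>     sudo -u hdfs hdfs fsck "$f" -files -blocks -locations
>   done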
>
>
> 2013/12/11 Adam Kawa <[EMAIL PROTECTED]>
>
>> Maybe this can work for you
>> $ sudo -u hdfs hdfs fsck / -list-corruptfileblocks
>> ?
>>
>>
>> 2013/12/11 ch huang <[EMAIL PROTECTED]>
>>
>>> Thanks for the reply. What I do not know is how to locate the block
>>> that has the corrupt replica, so that I can observe how long it takes
>>> for the corrupt replica to be removed and replaced by a healthy one. I
>>> have been getting Nagios alerts for three days, and I am not sure
>>> whether the same corrupt replica is causing them; I also do not know
>>> at what interval HDFS checks for corrupt replicas and cleans them up.
>>>
>>>
>>> On Tue, Dec 10, 2013 at 6:20 PM, Vinayakumar B <[EMAIL PROTECTED]> wrote:
>>>
>>>>  Hi ch huang,
>>>>
>>>>
>>>>
>>>> It may seem strange, but the fact is:
>>>>
>>>> *CorruptBlocks* through JMX means *“Number of blocks with corrupt
>>>> replicas”*; it does not necessarily mean that all replicas are
>>>> corrupt. You can check this description through jconsole.
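>>>>
>>>> For example, you can watch this counter via the NameNode's HTTP JMX
>>>> endpoint (a sketch; "namenode" and port 50070 are assumed defaults,
>>>> adjust for your cluster):
>>>>
>>>> $ curl -s 'http://namenode:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' | grep CorruptBlocks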
>>>>
>>>>
>>>>
>>>> Whereas *Corrupt blocks* through fsck means *blocks with all replicas
>>>> corrupt (non-recoverable) or missing*.
>>>>
>>>>
>>>>
>>>> In your case, maybe one of the replicas is corrupt, not all replicas
>>>> of the same block. The corrupt replica will be deleted automatically,
>>>> provided another datanode is available in your cluster for the block
>>>> to be re-replicated to.
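>>>>
>>>> If you want to watch this happen, you can check how often the
>>>> DataNode block scanner re-verifies replicas (a sketch; the property
>>>> name is from 2.x-era releases, verify for your version):
>>>>
>>>> $ hdfs getconf -confKey dfs.datanode.scan.period.hours
>>>>
>>>> And one commonly suggested way to force re-replication sooner is to
>>>> raise the replication factor of the affected file and then drop it
>>>> back (the path below is a placeholder):
>>>>
>>>> $ hdfs dfs -setrep 4 /path/to/affected/file
>>>> $ hdfs dfs -setrep 3 /path/to/affected/file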
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Regarding replication 10: as Peter Marron said, *some of the
>>>> important files of a MapReduce job are written with a replication
>>>> factor of 10, to make them accessible faster and to launch map tasks
>>>> faster.*
>>>>
>>>> Anyway, if the job succeeds, these files are deleted automatically. I
>>>> think only in some cases, such as when a job is killed partway
>>>> through, will these files remain in HDFS and show up as
>>>> under-replicated blocks.
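>>>>
>>>> If the factor of 10 bothers you on a small cluster, it can be lowered
>>>> at submission time (a sketch; the property name and its default of 10
>>>> are from Hadoop 2.x, the -D form assumes the job uses ToolRunner, and
>>>> myjob.jar / MyJob are placeholders):
>>>>
>>>> $ hadoop jar myjob.jar MyJob -D mapreduce.client.submit.file.replication=3 ...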
>>>>
>>>>
>>>>
>>>> Thanks and Regards,
>>>>
>>>> Vinayakumar B
>>>>
>>>>
>>>>
>>>> *From:* Peter Marron [mailto:[EMAIL PROTECTED]]
>>>> *Sent:* 10 December 2013 14:19
>>>> *To:* [EMAIL PROTECTED]
>>>> *Subject:* RE: how to handle the corrupt block in HDFS?
>>>>
>>>>
>>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> I am sure that there are others who will answer this better, but anyway.
>>>>
>>>> The default replication level for files in HDFS is 3 and so most
>>>> files that you see will have a replication level of 3. However, when
>>>> you run a Map/Reduce job, the system knows in advance that every node
>>>> will need a copy of certain files: specifically the job.xml and the
>>>> various jars containing classes that will be needed to run the
>>>> mappers and reducers. So the system arranges that some of these files
>>>> have a higher replication level. This increases the chances that a
>>>> copy will be found locally. By default this higher replication level
>>>> is 10.
>>>>
>>>>
>>>>
>>>> This can seem a little odd on a cluster where you only have, say, 3
>>>> nodes, because it means that you will almost always have some blocks
>>>> that are marked under-replicated. I think that there was some
>>>> discussion a while back about changing this to make the replication
>>>> level something like min(10, number of nodes). However, as I recall,
>>>> the general consensus was that this was extra complexity that wasn’t
>>>> really worth it. If it ain’t broke…
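>>>>
>>>> You can see these files for yourself while a job is running (a
>>>> sketch; the staging path below is an assumption and varies with
>>>> version and configuration):
>>>>
>>>> $ hdfs dfs -ls /tmp/hadoop-yarn/staging/$USER/.staging/job_*/
>>>>
>>>> The second column of the listing is the replication factor, which
>>>> should show 10 for the job.xml and the jars.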
>>>>
>>>>
>>>>
>>>> Hope that this helps.
>>>>
>>>>
>>>>
>>>> *Peter Marron*
>>>>
>>>> Senior Developer, Research & Development
>>>>
>>>>
>>>>
>>>> Office: +44 *(0) 118-940-7609*  [EMAIL PROTECTED]