Fixing Mis-replicated blocks
John Meagher 2011-10-20, 15:44
After a hardware move with a misconfigured rack awareness script, our Hadoop cluster has a large number of mis-replicated blocks. After about a week, things haven't gotten better on their own.
Is there a good way to trigger the name node to fix the mis-replicated blocks?
Here's what I'm using for now, but it is very slow:

for f in `hadoop fsck / | grep "Replica placement policy is violated" | head -n3000 | awk -F: '{print $1}'`; do
    hadoop fs -setrep 4 $f
    hadoop fs -setrep 3 $f
done
John
Re: Fixing Mis-replicated blocks
Jeff Bean 2011-10-21, 00:26
Use setrep -w on the increase so that the command waits for the new replica to be created before you decrease the factor again.
Of course, the little script only works if the replication factor is 3 on all the files. If it varies from file to file, you should use the Java API to get the existing factor, increase it by one, and then decrease it again.
Jeff
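For reference, the two suggestions above (wait with -w on the increase, and read each file's current factor instead of assuming 3) can be sketched in shell rather than through the Java API. This is only a rough sketch under assumptions: the function names are made up for illustration, it assumes your Hadoop version supports `hadoop fs -stat %r` for printing a file's replication factor, and it assumes fsck prints one "Replica placement policy is violated" line per affected block (hence the de-duplication of paths).

```shell
#!/usr/bin/env bash
# Sketch only: the function names below are hypothetical, not part of Hadoop.

# Read fsck output on stdin and print each flagged path once.
# fsck may report a file once per bad block, so duplicates are dropped.
misreplicated_paths() {
  grep "Replica placement policy is violated" | awk -F: '!seen[$1]++ {print $1}'
}

# Raise one file's replication by one, wait for the extra replica (-w),
# then restore the original factor so the surplus replica is removed.
# Assumes `hadoop fs -stat %r` prints the current replication factor.
fix_file() {
  local f="$1" r
  r=$(hadoop fs -stat %r "$f" </dev/null) || return 1
  hadoop fs -setrep -w $((r + 1)) "$f" </dev/null &&
    hadoop fs -setrep "$r" "$f" </dev/null
}

# Process at most $1 files per run (default 3000, as in the original loop).
fix_all() {
  local limit="${1:-3000}"
  hadoop fsck / | misreplicated_paths | head -n "$limit" |
    while IFS= read -r f; do
      fix_file "$f"
    done
}
```

After sourcing the file, something like `fix_all 3000` would process one batch. Waiting with -w makes each file slower, but it avoids dropping back to the original factor before the new replica exists.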
Re: Fixing Mis-replicated blocks
John Meagher 2011-10-21, 17:24
In this case every file should have a replication factor of 3. I was hoping there was a quicker way. The -w option should help, so that the script doesn't need to be run again.