-
final the dfs.replication and fsck
Patai Sangbutsarakum 2012-10-15, 19:01
Hi Hadoopers,
I have

<property>
  <name>dfs.replication</name>
  <value>2</value>
  <final>true</final>
</property>

set in hdfs-site.xml on the staging cluster. The staging cluster runs the code that will later be deployed to production, and that code tries to set dfs.replication to values other than 2, such as 3, 10, or 50: the numbers the developers thought would fit the production environment.
Even though I have already marked dfs.replication as final on the staging cluster, every time I run fsck there it still reports under-replicated blocks. I thought the final keyword meant that a value set in the job config would not be honored, but that doesn't seem to be the case when I run fsck.
I am on cdh3u4.
Please suggest.
Patai
-
Re: final the dfs.replication and fsck
Harsh J 2012-10-15, 19:18
Hi Patai,
Set the dfs.replication.max parameter to 2 to achieve what you want.
-- Harsh J
-
Re: final the dfs.replication and fsck
Chris Nauroth 2012-10-15, 19:18
Hello Patai,
Has your configuration file change been copied to all nodes in the cluster?
Are there applications connecting from outside of the cluster? If so, then those clients could have separate configuration files or code setting dfs.replication (and other configuration properties). These would not be limited by final declarations in the cluster's configuration files. <final>true</final> controls configuration file resource loading, but it does not necessarily block different nodes or different applications from running with completely different configurations.
Hope this helps,
--Chris
-
Re: final the dfs.replication and fsck
Harsh J 2012-10-15, 19:23
Hey Chris,
The dfs.replication param is an exception to the <final> config feature. If one uses the FileSystem API, one can pass in any short value as the replication factor. This bypasses the configuration, and the setting (being per-file) is also client-side.
The right way for an administrator to enforce a "max" replication value at the create/setRep level is to set dfs.replication.max to the desired value at the NameNode and restart it.
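A minimal sketch of that setting, assuming it goes into the NameNode's hdfs-site.xml next to the existing dfs.replication entry. The value 2 is an assumption taken from the staging default in the original question; adjust it to whatever cap is wanted.

```xml
<!-- hdfs-site.xml on the NameNode (sketch; the value 2 is an assumption
     matching the staging default of dfs.replication from the question) -->
<property>
  <name>dfs.replication.max</name>
  <value>2</value>
</property>
```

As far as I know, once the NameNode has been restarted, create or setReplication requests asking for more than the maximum are rejected with an error rather than silently lowered, so code that requests 3, 10, or 50 may need its settings adjusted for staging. Files that were already written with a higher replication factor are not changed by this setting and would likely keep showing up as under-replicated in fsck until their factor is lowered.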
-- Harsh J