Re: final the dfs.replication and fsck
Patai Sangbutsarakum 2012-10-15, 20:57
Thanks Harsh, dfs.replication.max does do the magic!!
On Mon, Oct 15, 2012 at 1:19 PM, Chris Nauroth <[EMAIL PROTECTED]> wrote:
> Thank you, Harsh. I did not know about dfs.replication.max.
>
> On Mon, Oct 15, 2012 at 12:23 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>>
>> Hey Chris,
>>
>> The dfs.replication param is an exception to the <final> config
>> feature. If one uses the FileSystem API, one can pass in any short
>> value they want the replication to be. This bypasses the
>> configuration, and the configuration (being per-file) is also client
>> sided.
>>
>> The right way for an administrator to enforce a "max" replication
>> value at a create/setRep level, would be to set
>> the dfs.replication.max to a desired value at the NameNode and restart
>> it.
>>
>> On Tue, Oct 16, 2012 at 12:48 AM, Chris Nauroth
>> <[EMAIL PROTECTED]> wrote:
>> > Hello Patai,
>> >
>> > Has your configuration file change been copied to all nodes in the
>> > cluster?
>> >
>> > Are there applications connecting from outside of the cluster? If so,
>> > then those clients could have separate configuration files or code
>> > setting dfs.replication (and other configuration properties). These
>> > would not be limited by final declarations in the cluster's
>> > configuration files. <final>true</final> controls configuration file
>> > resource loading, but it does not necessarily block different nodes
>> > or different applications from running with completely different
>> > configurations.
>> >
>> > Hope this helps,
>> > --Chris
>> >
>> > On Mon, Oct 15, 2012 at 12:01 PM, Patai Sangbutsarakum
>> > <[EMAIL PROTECTED]> wrote:
>> >>
>> >> Hi Hadoopers,
>> >>
>> >> I have
>> >> <property>
>> >> <name>dfs.replication</name>
>> >> <value>2</value>
>> >> <final>true</final>
>> >> </property>
>> >>
>> >> set in hdfs-site.xml in staging environment cluster. while the staging
>> >> cluster is running the code that will later be deployed in production,
>> >> those code is trying to have dfs.replication of 3, 10, 50, other than
>> >> 2; the number that developer thought that will fit in production
>> >> environment.
>> >>
>> >> Even though I final the property dfs.replication in staging cluster
>> >> already. every time i run fsck on the staging cluster i still see it
>> >> said under replication.
>> >> I thought final keyword will not honor value in job config, but it
>> >> doesn't seem so when i run fsck.
>> >>
>> >> I am on cdh3u4.
>> >>
>> >> please suggest.
>> >> Patai
>>
>> --
>> Harsh J
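For readers following along, the NameNode-side cap that Harsh describes might look like the fragment below. This is a minimal sketch, not a configuration taken from the thread: the value 2 simply mirrors the staging cluster's dfs.replication, and placing the property in the NameNode's hdfs-site.xml is an assumption based on where the original property was set.

```xml
<!-- hdfs-site.xml on the NameNode (assumed location) -->
<property>
  <name>dfs.replication.max</name>
  <!-- Assumed value: matches the staging cluster's dfs.replication of 2 -->
  <value>2</value>
</property>
```

As Harsh notes, the NameNode has to be restarted for the setting to take effect. Unlike <final>, this limit is checked on the server side, so it also applies to clients that carry their own configuration or pass a replication value through the FileSystem API.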
Re: final the dfs.replication and fsck
Patai Sangbutsarakum 2012-10-16, 00:02
I just want to share this and check whether it makes sense.
After I restarted the NameNode, the cluster stopped complaining about under-replication, but jobs failed to run.
This is what I found in the log file:
Requested replication 10 exceeds maximum 2
java.io.IOException: file /tmp/hadoop-apps/mapred/staging/apps/.staging/job_201210151601_0494/job.jar. Requested replication 10 exceeds maximum 2
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyReplication(FSNamesystem.java:1126)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplicationInternal(FSNamesystem.java:1074)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:1059)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.setReplication(NameNode.java:629)
    at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:143
So I scanned through the XML config files, guessed that I should change <name>mapred.submit.replication</name> from 10 to 2, and restarted again.
That's when jobs could start running again. Hopefully that change makes sense.
Thanks
Patai
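The change Patai describes could be written as the fragment below. This is only a sketch: the thread does not say which file the property lived in, so mapred-site.xml is an assumption, and 2 is chosen to match the dfs.replication.max cap set earlier in the thread.

```xml
<!-- mapred-site.xml (assumed file; the thread only says "those xml config files") -->
<property>
  <name>mapred.submit.replication</name>
  <!-- Was 10 in this thread; must not exceed dfs.replication.max (2 here) -->
  <value>2</value>
</property>
```

The value has to stay at or below dfs.replication.max; otherwise the setReplication call on job.jar is rejected by the NameNode, as in the trace quoted in this message.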
Re: final the dfs.replication and fsck
Harsh J 2012-10-16, 04:25
Patai,
My bad - that was on my mind but I missed noting it down in my earlier reply. Yes, you'd have to control that as well. 2 should be fine for smaller clusters.
Harsh J
Re: final the dfs.replication and fsck
Patai Sangbutsarakum 2012-10-16, 07:27
Thank you so much for confirming that.