MapReduce >> mail # user >> Re: final the dfs.replication and fsck


Patai Sangbutsarakum 2012-10-15, 20:57
Patai Sangbutsarakum 2012-10-16, 00:02
Harsh J 2012-10-16, 04:25
Re: final the dfs.replication and fsck
Thank you so much for confirming that.

On Mon, Oct 15, 2012 at 9:25 PM, Harsh J <[EMAIL PROTECTED]> wrote:
> Patai,
>
> My bad - that was on my mind but I missed noting it down on my earlier
> reply. Yes you'd have to control that as well. 2 should be fine for
> smaller clusters.
>
> On Tue, Oct 16, 2012 at 5:32 AM, Patai Sangbutsarakum
> <[EMAIL PROTECTED]> wrote:
>> Just want to share & check if this makes sense.
>>
>> A job failed to run after I restarted the namenode, and the cluster
>> stopped complaining about under-replication.
>>
>> This is what I found in the log file:
>>
>> Requested replication 10 exceeds maximum 2
>> java.io.IOException: file
>> /tmp/hadoop-apps/mapred/staging/apps/.staging/job_201210151601_0494/job.jar.
>> Requested replication 10 exceeds maximum 2
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyReplication(FSNamesystem.java:1126)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplicationInternal(FSNamesystem.java:1074)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:1059)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.setReplication(NameNode.java:629)
>>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:143
>>
>>
>> So I scanned through the XML config files, guessed I should change
>> <name>mapred.submit.replication</name> from 10 to 2, and restarted again.
>>
>> That's when jobs could start running again.
>> Hopefully that change makes sense.
>>
>>
>> Thanks
>> Patai
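
[Editor's note: for reference, the fix described above would correspond to a
mapred-site.xml entry like the following. The value 2 is taken from this
thread, where it matches the cluster's dfs.replication.max; adjust for your
own cluster.]

```xml
<!-- mapred-site.xml: cap the replication of submitted job files
     (e.g. job.jar) so it does not exceed dfs.replication.max -->
<property>
    <name>mapred.submit.replication</name>
    <value>2</value>
</property>
```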
>>
>> On Mon, Oct 15, 2012 at 1:57 PM, Patai Sangbutsarakum
>> <[EMAIL PROTECTED]> wrote:
>>> Thanks Harsh, dfs.replication.max does do the magic!!
>>>
>>> On Mon, Oct 15, 2012 at 1:19 PM, Chris Nauroth <[EMAIL PROTECTED]> wrote:
>>>> Thank you, Harsh.  I did not know about dfs.replication.max.
>>>>
>>>>
>>>> On Mon, Oct 15, 2012 at 12:23 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>> Hey Chris,
>>>>>
>>>>> The dfs.replication param is an exception to the <final> config
>>>>> feature. If one uses the FileSystem API, one can pass in any short
>>>>> value they want the replication to be. This bypasses the
>>>>> configuration, and the configuration (being per-file) is also
>>>>> client-side.
>>>>>
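
[Editor's note: Harsh's point — that per-file replication is chosen by the
client and is not enforced by <final> — can be sketched with the HDFS
FileSystem API. This is a hypothetical illustration; the path, buffer size,
block size, and the replication factor of 10 are made-up values, and running
it requires a Hadoop client configured against a cluster.]

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/tmp/demo.txt");

        // A client may request any replication factor it likes when
        // creating a file, regardless of a <final> dfs.replication
        // in the cluster's config files ...
        fs.create(p,
                  true,                // overwrite
                  4096,                // buffer size
                  (short) 10,          // requested replication, client-chosen
                  64 * 1024 * 1024L)   // block size
          .close();

        // ... or change it after the fact:
        fs.setReplication(p, (short) 10);

        // Only dfs.replication.max on the NameNode rejects such requests,
        // producing the "Requested replication 10 exceeds maximum 2"
        // error seen earlier in this thread.
    }
}
```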
>>>>> The right way for an administrator to enforce a "max" replication
>>>>> value at the create/setRep level would be to set
>>>>> dfs.replication.max to the desired value on the NameNode and restart
>>>>> it.
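
[Editor's note: the NameNode-side setting Harsh suggests would look
something like this in hdfs-site.xml; the value 2 is the one used in this
thread.]

```xml
<!-- hdfs-site.xml on the NameNode: hard upper bound on per-file
     replication; create/setReplication requests above this value
     are rejected server-side -->
<property>
    <name>dfs.replication.max</name>
    <value>2</value>
</property>
```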
>>>>>
>>>>> On Tue, Oct 16, 2012 at 12:48 AM, Chris Nauroth
>>>>> <[EMAIL PROTECTED]> wrote:
>>>>> > Hello Patai,
>>>>> >
>>>>> > Has your configuration file change been copied to all nodes in the
>>>>> > cluster?
>>>>> >
>>>>> > Are there applications connecting from outside of the cluster?  If so,
>>>>> > then
>>>>> > those clients could have separate configuration files or code setting
>>>>> > dfs.replication (and other configuration properties).  These would not
>>>>> > be
>>>>> > limited by final declarations in the cluster's configuration files.
>>>>> > <final>true</final> controls configuration file resource loading, but it
>>>>> > does not necessarily block different nodes or different applications
>>>>> > from
>>>>> > running with completely different configurations.
>>>>> >
>>>>> > Hope this helps,
>>>>> > --Chris
>>>>> >
>>>>> >
>>>>> > On Mon, Oct 15, 2012 at 12:01 PM, Patai Sangbutsarakum
>>>>> > <[EMAIL PROTECTED]> wrote:
>>>>> >>
>>>>> >> Hi Hadoopers,
>>>>> >>
>>>>> >> I have
>>>>> >> <property>
>>>>> >>     <name>dfs.replication</name>
>>>>> >>     <value>2</value>
>>>>> >>     <final>true</final>