Re: Hadoop doesnt use Replication Level of Namenode
Edward Capriolo 2011-09-13, 14:56
On Tue, Sep 13, 2011 at 5:53 AM, Steve Loughran <[EMAIL PROTECTED]> wrote:
> On 13/09/11 05:02, Harsh J wrote:
>> There is no current way to 'fetch' a config at the moment. You have
>> the NameNode's config available at NNHOST:WEBPORT/conf page which you
>> can perhaps save as a resource (dynamically) and load into your
>> Configuration instance, but apart from this hack the only other ways
>> are the ones Bharath mentioned. This might lead to slow start ups of
>> your clients, but would give you the result you want.
> I've done it in a modified version of Hadoop; all it takes is a servlet in the
> NN. It even served up the live data of the addresses and ports a NN was
> running on, even if it didn't know in advance.
Another technique: if you are using a single replication factor on all
files, you can mark the property as <final>true</final> in the
configuration of the NameNode and DataNode. This will always override the
client settings. In general, however, it is best to manage client
configurations as carefully as you manage the server ones, and to ensure
that clients are given the configuration they MUST use, via puppet/cfengine
etc. Essentially, do not count on clients to get the settings right, because
the risk is too high if they are set wrong. I.e. your situation: "I thought
everything was replicated 3 times."
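
For illustration, a minimal sketch of what such a final entry might look
like. It assumes the property in question is dfs.replication, that it lives
in hdfs-site.xml, and that the desired factor is 3; adjust all three to
match your cluster.

```xml
<?xml version="1.0"?>
<!-- hdfs-site.xml (assumed file name) -->
<configuration>
  <property>
    <!-- Assumed property: the default block replication factor -->
    <name>dfs.replication</name>
    <!-- Assumed value, taken from the "replicated 3 times" expectation -->
    <value>3</value>
    <!-- Marks the value as final so later-loaded resources cannot override it -->
    <final>true</final>
  </property>
</configuration>
```

The same file is the sort of thing you would push out to every client
machine with puppet/cfengine, so that clients and servers agree.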