Steve Cohen 2011-05-11, 20:59
Hello,
We are running an hdfs cluster and we decided we wanted to add a new datanode. Since we are using a virtual machine, we just cloned an existing datanode. We added it to the slaves list and started up the cluster. We started getting log messages like this in the namenode log:
2011-05-11 15:59:44,148 ERROR hdfs.StateChange - BLOCK* NameSystem.getDatanode: Data node 10.104.211.58:50010 is attempting to report storage ID DS-1360904153-10.104.211.57-50010-1293288346692. Node 10.104.211.57:50010 is expected to serve this storage.
2011-05-11 15:59:46,975 ERROR hdfs.StateChange - BLOCK* NameSystem.getDatanode: Data node 10.104.211.57:50010 is attempting to report storage ID DS-1360904153-10.104.211.57-50010-1293288346692. Node 10.104.211.58:50010 is expected to serve this storage.
I understand that this is because the two datanodes hold exactly the same information (including the same storage ID), so whichever datanode connects first takes precedence.
Is it possible to just wipe one of the datanodes so that it is blank, or do we have to format the entire hdfs filesystem from the namenode to add the new datanode?
Thanks, Steve Cohen
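For anyone who wants to check a namenode log for this symptom, here is a minimal sketch that groups the addresses named in each message by storage ID. It assumes the log lines look exactly like the two quoted above; the function name find_storage_id_conflicts is made up for illustration and is not part of Hadoop.

```python
import re
from collections import defaultdict

# Matches the namenode message quoted above. The wording is taken from
# this thread's log excerpt; other Hadoop versions may phrase it differently.
CONFLICT = re.compile(
    r"Data node (?P<reporter>\S+) is attempting to report storage ID "
    r"(?P<storage_id>\S+)\. Node (?P<expected>\S+) is expected to serve "
    r"this storage\."
)


def find_storage_id_conflicts(log_text):
    """Return {storage_id: set of datanode addresses named in conflict messages}."""
    claims = defaultdict(set)
    for match in CONFLICT.finditer(log_text):
        # Both the reporting node and the expected node claim the same ID.
        claims[match.group("storage_id")].add(match.group("reporter"))
        claims[match.group("storage_id")].add(match.group("expected"))
    return dict(claims)
```

Passing the contents of the namenode log to this function should yield one entry per duplicated storage ID, each listing the datanodes that share it.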
Jeff Bean 2011-05-11, 21:02
If I understand correctly, datanode reports its blocks based on the contents of dfs.data.dir.
When you cloned the data node, you cloned all of its blocks as well.
When you add a "fresh" datanode to the cluster, you add one that has an empty dfs.data.dir.
Try clearing out dfs.data.dir before adding the new node.
Jeff
Steve Cohen 2011-05-11, 22:32
Thanks, Jeff. Deleting the contents of dfs.data.dir on the cloned data node worked.
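In case it helps someone later, the cleanup can be sketched roughly as follows. This is only an illustration, not an official Hadoop tool: the helper name clear_data_dir is made up, and the path you pass in must be whatever dfs.data.dir points to on the cloned node. It assumes the datanode process on the clone is stopped first, and it should never be run on the original datanode, since it deletes every block stored there.

```python
import shutil
from pathlib import Path


def clear_data_dir(data_dir):
    """Delete everything inside data_dir but keep the directory itself.

    data_dir is assumed to be the dfs.data.dir location on the CLONED
    datanode, with the datanode process already stopped.
    """
    root = Path(data_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            # Real subdirectory: remove it and everything beneath it.
            shutil.rmtree(entry)
        else:
            # Regular file or symlink: remove just this entry.
            entry.unlink()
```

Keeping the top-level directory in place preserves its ownership and permissions, so the datanode can start against an empty dfs.data.dir as a "fresh" node would.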