Re: Balancing a cluster when a new node is added
Hi,
Yes, the config files are the same. I checked the namenode log;
for each of the 5 pre-existing nodes I see something like

2010-01-10 12:32:33,921 INFO org.apache.hadoop.hdfs.StateChange:
BLOCK* NameSystem.registerDatanode: node registration from
X.Y.Z.D:50010 storage DS-1908504044-127.0.0.1-50010-1263057662169

but not for the newly added node.
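(I looked for these registrations roughly like this; the log path below
is just where my install keeps it:

  grep registerDatanode logs/hadoop-*-namenode-*.log

and entries show up for the 5 old nodes only.)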
I just added the machine to the slaves file and restarted the cluster.
Is there something else I should do to the new node?
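Concretely, what I did was roughly this (the hostname is a placeholder):

  echo 'new-node.example.com' >> conf/slaves
  bin/stop-dfs.sh && bin/start-dfs.sh

Should I instead start the daemon by hand on the new machine, e.g. with
bin/hadoop-daemon.sh start datanode?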

Regards
Saptarshi

On Sun, Jan 10, 2010 at 4:11 AM, Eli Collins <[EMAIL PROTECTED]> wrote:
> Have you verified that this new DN's Hadoop configuration files are the
> same as the others'? Do you see any errors in the NN log when restarting
> HDFS on this new node?
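>
> Something like this, run from the new node, would catch config drift
> (the path and hostname are placeholders):
>
>   diff <(ssh old-node cat /path/to/hadoop/conf/hdfs-site.xml) \
>       conf/hdfs-site.xml
>
> and the same for core-site.xml.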
>
> Thanks,
> Eli
>
> On Sat, Jan 9, 2010 at 9:44 AM, Saptarshi Guha <[EMAIL PROTECTED]> wrote:
>> Hello,
>> I'm using Hadoop 0.20.1. I just added a new node to a 5-node
>> cluster (for a total of 6); there are already about 500 GB across the
>> 5 nodes.
>> In order to distribute the data across the entire cluster (including
>> the new node) I ran
>>
>> hadoop balancer
>> Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
>> The cluster is balanced. Exiting...
>> Balancing took 356.0 milliseconds
>>
>> Clearly the cluster is not balanced, but how do I force it to be so?
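>> (The balancer takes a threshold argument, the allowed per-node
>> deviation from the average disk usage in percent, default 10. I ran it
>> with the default above; would something stricter, e.g.
>>
>>   hadoop balancer -threshold 5
>>
>> make any difference here?)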
>>
>> Q2. On the DFS UI website, when I click on the existing nodes to see
>> their data, I can, but when I click on the new node, I can't connect.
>> Does this happen when there are no files? The datanode log for this
>> machine does not show any errors. I have managed to copy a small file
>> to this new machine (from the new machine, so the file is stored on
>> this machine's section of the DFS).
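>>
>> (Two things I plan to try, the hostname below being a placeholder:
>> hitting the datanode web port directly,
>>
>>   curl http://new-node:50075/
>>
>> and checking what the namenode reports for this node:
>>
>>   hadoop dfsadmin -report
>>
>> to see whether it shows up as live with zero blocks.)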
>>
>>
>> 2010-01-09 12:20:57,681 INFO org.apache.hadoop.http.HttpServer:
>> listener.getLocalPort() returned 50075
>> webServer.getConnectors()[0].getLocalPort() returned 50075
>> 2010-01-09 12:20:57,681 INFO org.apache.hadoop.http.HttpServer: Jetty
>> bound to port 50075
>> 2010-01-09 12:20:57,681 INFO org.mortbay.log: jetty-6.1.14
>> 2010-01-09 12:21:02,148 INFO org.mortbay.log: Started
>> SelectChannelConnector@0.0.0.0:50075
>> 2010-01-09 12:21:02,152 INFO org.apache.hadoop.metrics.jvm.JvmMetrics:
>> Initializing JVM Metrics with processName=DataNode, sessionId=null
>> 2010-01-09 12:21:02,165 INFO org.apache.hadoop.ipc.metrics.RpcMetrics:
>> Initializing RPC Metrics with hostName=DataNode, port=50020
>> 2010-01-09 12:21:02,167 INFO org.apache.hadoop.ipc.Server: IPC Server
>> Responder: starting
>> 2010-01-09 12:21:02,168 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 0 on 50020: starting
>> 2010-01-09 12:21:02,168 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 1 on 50020: starting
>> 2010-01-09 12:21:02,168 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration =
>> DatanodeRegistration(altair.stat.purdue.edu:50010, storageID=,
>> infoPort=50075, ipcPort=50020)
>> 2010-01-09 12:21:02,169 INFO org.apache.hadoop.ipc.Server: IPC Server
>> listener on 50020: starting
>> 2010-01-09 12:21:02,170 INFO org.apache.hadoop.ipc.Server: IPC Server
>> handler 2 on 50020: starting
>> 2010-01-09 12:21:02,173 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: New storage id
>> DS-1908504044-127.0.0.1-50010-1263057662169 is assigned to data-node
>> 128.210.141.105:50010
>> 2010-01-09 12:21:02,173 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode:
>> DatanodeRegistration(X.X.X.X:50010,
>> storageID=DS-1908504044-127.0.0.1-50010-1263057662169, infoPort=50075,
>> ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/ln/meraki/hdfs/dfs/data/current'}
>> 2010-01-09 12:21:02,173 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: using
>> BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
>> 2010-01-09 12:21:02,187 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: BlockReport of 0
>> blocks got processed in 2 msecs
>> 2010-01-09 12:21:02,188 INFO
>> org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic