Yes, you can. You just need to copy the data in log.dir on disk to the new machine and keep the broker.id in the broker config the same. There is no need to change anything in ZK, since the broker will re-register on startup. The main purpose of broker.id is to allow people to move data logically from one broker to another.
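The broker.id requirement above can be checked before starting the broker on the new machine. The following is only a minimal sketch: get_broker_id and same_broker_id are hypothetical helper names, and it assumes the config uses the plain key=value form (broker.id=1), not the other separators that Java properties files permit.

```shell
#!/bin/sh
# Sketch only: compare broker.id between the old and the new broker config.
# get_broker_id and same_broker_id are hypothetical helpers, not part of Kafka.

get_broker_id() {
    # Print the value of the last uncommented "broker.id=..." line in the file.
    sed -n 's/^[[:space:]]*broker\.id[[:space:]]*=[[:space:]]*\([^[:space:]]*\).*$/\1/p' "$1" | tail -n 1
}

same_broker_id() {
    # Succeed only if both files define broker.id and the values are equal.
    old_id=$(get_broker_id "$1")
    new_id=$(get_broker_id "$2")
    [ -n "$old_id" ] && [ "$old_id" = "$new_id" ]
}
```

For example, `same_broker_id old/server.properties new/server.properties || echo "broker.id differs"` (both paths are placeholders) would flag a mismatch before the copied log.dir is used.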
On Wed, Feb 27, 2013 at 2:11 AM, Jason Huang <[EMAIL PROTECTED]> wrote:
As Jun described, the purpose of broker.id is to be able to move data from one broker to the other without changes. I believe this should work in 0.8 as well. However, we've never tried it, so I'm not sure whether there are bugs. Let us know how it goes.
Thanks,
Neha
On Wed, Feb 27, 2013 at 1:31 PM, Jason Huang <[EMAIL PROTECTED]> wrote:
I started by copying only $log.dir from server A to server B. Both server A and server B run the same version of kafka 0.8 with the same configuration files.
However, after starting kafka 0.8 on server B, I get the following exception when I try to fetch messages:
[2013-02-28 05:56:35,851] WARN [KafkaApi-1] Error while responding to offset request (kafka.server.KafkaApis)
kafka.common.UnknownTopicOrPartitionException: Topic topic_general partition 0 doesn't exist on 1
at kafka.server.ReplicaManager.getLeaderReplicaIfLocal(ReplicaManager.scala:163)
........
However, the folder topic_general-0 exists, and it contains the files 00000000000000000000.log and 00000000000000000000.index. There is also a replication-offset-checkpoint file in this $log.dir folder. I then copied both $log.dir and the zookeeper folder from server A to server B and ran it. In the zookeeper folder I have the following files:
-rw-r--r--. 1 root root 296 Feb 28 06:12 snapshot.0
-rw-r--r--. 1 root root 67108880 Feb 28 06:12 log.1
-rw-r--r--. 1 root root 67108880 Feb 28 06:12 log.4b
-rw-r--r--. 1 root root 4817 Feb 28 06:12 snapshot.4a
With both the log data and the zookeeper data copied over to server B, I am getting startup errors in the zookeeper log:
INFO Got user-level KeeperException when processing sessionid:0x13d20830d3e0000 type:create cxid:0x1 zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a Error Path:/brokers/ids/1 Error:KeeperErrorCode = NodeExists for /brokers/ids/1 (org.apache.zookeeper.server.PrepRequestProcessor)
and startup errors in the kafka server log:
[2013-02-28 06:15:00,228] ERROR [Partition state machine on Controller 1]: State change for partition [topic_84784ecc-3803-42eb-bcdd-31dc42b697c6, 0] from OfflinePartition to OnlinePartition failed (kafka.controller.PartitionStateMachine)
kafka.common.PartitionOfflineException: All replicas for partition [topic_84784ecc-3803-42eb-bcdd-31dc42b697c6, 0] are dead. Marking this partition offline
And I am getting the same error when trying to fetch messages:
[2013-02-28 06:20:01,516] WARN [KafkaApi-1] Error while responding to offset request (kafka.server.KafkaApis)
kafka.common.UnknownTopicOrPartitionException: Topic topic_general partition 0 doesn't exist on 1
at kafka.server.ReplicaManager.getLeaderReplicaIfLocal(ReplicaManager.scala:163)
I am running both zookeeper and kafka on the same server. I only have one server so the replication factor is 1.
Looks like something went wrong for me. Any ideas?
On Wed, Feb 27, 2013 at 6:59 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
I actually tried to load the data back with the same instance of kafka on server A, so the broker id must be the same. The reason I brought this up in the first place is that we've had some issues recognizing the messages after a server stop/restart. I was able to reproduce our issue with the following steps:
(3) stop server
sudo /opt/kafka-0.8.0/bin/kafka-server-stop.sh
sudo /opt/kafka-0.8.0/bin/zookeeper-server-stop.sh
Notice that kafka-server-stop.sh uses kill -SIGTERM and zookeeper-server-stop.sh uses kill -SIGINT. My observation is that on our server kill -SIGINT doesn't actually kill the zookeeper process. (I can still see it running when I check the processes.)
Starting from this state (after running kill -SIGTERM for the kafka server and kill -SIGINT for the zookeeper server), we restart the zookeeper and kafka services:
nohup sudo /opt/kafka-0.8.0/bin/zookeeper-server-start.sh /opt/kafka-0.8.0/config/zookeeper.properties > /opt/kafka-0.8.0/data/kafka-logs/zook.out 2>&1 &
nohup sudo /opt/kafka-0.8.0/kafka-server-start.sh /opt/kafka-0.8.0/config/server.properties > /opt/kafka-0.8.0/data/kafka-logs/kafka.out 2>&1 &
Then, when we try to fetch messages from existing topics and partitions, we get the following error:
WARN [KafkaApi-1] Error while responding to offset request (kafka.server.KafkaApis)
kafka.common.UnknownTopicOrPartitionException: Topic topic_general partition 0 doesn't exist on 1
at kafka.server.ReplicaManager.getLeaderReplicaIfLocal(ReplicaManager.scala:163)
I am not sure if anyone has experienced this before. It appears to me that, because kill -SIGINT didn't actually kill the previous zookeeper process, restarting from that state corrupts the partition/topic information held in zookeeper. And maybe because of that, copying the log files and trying to reload them won't work (because that information was somehow corrupted)?
On Thu, Feb 28, 2013 at 12:10 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
Sure, I can collect the logs. However, the strange thing in my case is that the zookeeper-server-stop.sh script (kill -SIGINT) didn't actually kill the zookeeper process on my server. When you shut down zookeeper in your steps, did you double-check whether the zookeeper process had actually been killed? (ps aux | grep "zookeeper")
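The manual ps check above could be turned into a small wait loop that is run after the stop script and before any restart. This is a rough sketch, not something shipped with kafka or zookeeper: wait_for_exit is a hypothetical helper, the PID is assumed to have been found already (for example from the ps output), and kill -0 only reports correctly when run with enough privileges to signal the process (for example under sudo, as the commands in this thread are).

```shell
#!/bin/sh
# Sketch only: wait until a process is gone, or give up after a timeout.
# wait_for_exit is a hypothetical helper, not part of kafka or zookeeper.

wait_for_exit() {
    # $1 = PID to watch, $2 = timeout in seconds (defaults to 10).
    pid=$1
    timeout=${2:-10}
    waited=0
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # still running after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0            # process has exited
}
```

A call such as `wait_for_exit "$zk_pid" 10 || echo "zookeeper is still running"` (zk_pid is a placeholder variable) would make it obvious when the stop script did not take effect, so that a second zookeeper is not started on top of the old one.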
On Fri, Mar 1, 2013 at 5:40 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote: