Depending on how much data there is in those partitions, it can take a while for reassignment to actually complete. You will need to use the command's status-check option to determine whether partition reassignment has completed or not.
Joel
On Tue, Oct 15, 2013 at 3:46 PM, Kane Kane <[EMAIL PROTECTED]> wrote:
For a leader change yes, but this is partition reassignment which completes when all the reassigned replicas are in sync with the original replica(s). You can check the status of the command using the option I mentioned earlier.
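The completion condition described above can be sketched in a few lines of Python. This is only an illustration of the rule "reassignment is done when every reassigned replica is in sync", not Kafka's actual implementation; the function names and the sample broker ids are hypothetical, and the in-sync replica lists are assumed to have been fetched by some other means.

```python
def reassignment_complete(target_replicas, in_sync_replicas):
    """Return True when every target replica appears in the in-sync set."""
    return set(target_replicas) <= set(in_sync_replicas)


def pending_partitions(targets, isr_by_partition):
    """List the partitions whose reassignment has not yet completed.

    targets: dict of partition id -> list of target replica broker ids
    isr_by_partition: dict of partition id -> list of in-sync broker ids
    (both inputs are hypothetical snapshots supplied by the caller)
    """
    return sorted(
        partition
        for partition, replicas in targets.items()
        if not reassignment_complete(replicas, isr_by_partition.get(partition, []))
    )


if __name__ == "__main__":
    # Hypothetical example: partition 1 is still waiting for broker 3 to catch up.
    targets = {0: [1, 2], 1: [2, 3]}
    isr = {0: [1, 2], 1: [2]}
    print(pending_partitions(targets, isr))
```

A partition with a large backlog stays in the pending list until its new replicas have caught up, which is why the amount of data matters.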
On Tue, Oct 15, 2013 at 7:02 PM, Kane Kane <[EMAIL PROTECTED]> wrote:
Oh, I see. What is a better way to initiate a leader change? As I said, all my partitions somehow have the same leader: I have 3 brokers, and every partition has its leader on a single one.
On Wed, Oct 16, 2013 at 12:04 AM, Joel Koshy <[EMAIL PROTECTED]> wrote:
Yes, thanks, that looks like what I need. Do you know why it tends to choose the leader for all partitions on a single broker, even though I have 3?
On Wed, Oct 16, 2013 at 12:19 AM, Joel Koshy <[EMAIL PROTECTED]> wrote:
I'm in the process of reassigning partitions away from failing machines and it appears to be stuck. One thought is that our machines are failing at a very high rate, so some partitions no longer have any live replicas at all. At this point I don't care about the data; I just want to get all partitions onto the set of machines that I know work. Is there some way I can do this? I am happy to manipulate ZooKeeper and bounce nodes if need be.
And a warning... this is due to Amazon EC2 d2 instance type failures. We spun up 9 d2.xlarge instances and within a few hours 6 have failed under a Kafka workload. So yeah, bleeding edge.
One thing I've done is rebuild one of these nodes with the same broker id and name but under a known working instance type. It came up and is now spewing this in the logs:
[2015-04-03 13:05:30,275] 805497 [kafka-request-handler-0] WARN kafka.server.KafkaApis - [KafkaApi-29] Produce request with correlation id 5849 from client ping_partitioner on partition [pings,245] failed due to Topic pings either doesn't exist or is in the process of being deleted
The topic most certainly should exist; however, I'm guessing it's complaining because there are no live replicas for that partition. Is there some way to get it to just become leader?
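For the goal of moving every partition onto a set of known-good machines, a minimal sketch of building the reassignment input might look like the following. It assumes the JSON layout the reassign-partitions tool accepts is a `version` field plus a `partitions` list of topic, partition, and replicas entries. The broker ids, partition count, and replication factor are hypothetical placeholders, and the round-robin placement is just one simple choice. It only produces the file; whether a reassignment can complete for a partition with no live replicas is a separate question.

```python
import json


def build_reassignment(topic, num_partitions, healthy_brokers, replication_factor):
    """Spread each partition's replicas round-robin over the healthy brokers."""
    if replication_factor > len(healthy_brokers):
        raise ValueError("replication factor exceeds number of healthy brokers")
    broker_count = len(healthy_brokers)
    partitions = []
    for partition in range(num_partitions):
        # Start at a different broker for each partition so that the first
        # replica (the preferred leader) is spread across brokers.
        replicas = [
            healthy_brokers[(partition + offset) % broker_count]
            for offset in range(replication_factor)
        ]
        partitions.append(
            {"topic": topic, "partition": partition, "replicas": replicas}
        )
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Hypothetical values: 4 partitions, 3 working brokers, 2 replicas each.
    plan = build_reassignment("pings", 4, [101, 102, 103], 2)
    print(json.dumps(plan, indent=2))
```

Because the first replica in each list rotates through the brokers, this layout also avoids placing every preferred leader on one machine.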