Have you tried the replication feature added in 0.8? http://kafka.apache.org/documentation.html#replication /******************************************* Joe Stein Founder, Principal Consultant Big Data Open Source Security LLC http://www.stealth.ly Twitter: @allthingshadoop ********************************************/ On Nov 28, 2013, at 9:37 AM, Demian Berjman <[EMAIL PROTECTED]> wrote:
What I mean by that is: are you looking to have the Kafka cluster able to be down for something like 5 minutes, or up to a day? The problem is estimating how long it will take to recover.
Is this work you're doing for a consulting project, or are you doing something on behalf of an employer? Basically, I would like to know more about the use case. You can email me directly at [EMAIL PROTECTED] so we don't clog the message board. On Thu, Nov 28, 2013 at 8:05 AM, Demian Berjman <[EMAIL PROTECTED]> wrote:
Steve, our use case is very simple. There are many reasons for a cluster to go down. If that is the case, what do we do with the producers? Hopefully it will be a time window of a couple of hours. If your concern is the queued messages, we have only a few thousand per day.
Thanks, On Thu, Nov 28, 2013 at 1:12 PM, Steve Morin <[EMAIL PROTECTED]> wrote:
I would say follow Sparkngin and you'll be able to use the "Log persistence if the log producer connection is down" functionality when it's complete. On Thu, Nov 28, 2013 at 8:36 AM, Demian Berjman <[EMAIL PROTECTED]> wrote:
We've done this at Sematext, where we use Kafka in all 3 products/services you see in my signature. When we fail to push a message into Kafka we store it in the FS and from there we can process it later.
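For readers following the thread, the "store it in the FS and process it later" approach can be sketched roughly as below. This is a minimal illustration, not Sematext's actual code: the `SpoolingProducer` class, the `send` callable (standing in for whatever Kafka producer call you use), and the spool file layout are all hypothetical, and it assumes a single-threaded producer that can tolerate at-least-once delivery.

```python
import os
from typing import Callable


class SpoolingProducer:
    """Hypothetical wrapper: try to send, and spool to a local file on failure."""

    def __init__(self, send: Callable[[bytes], None], spool_path: str) -> None:
        # `send` is an assumed stand-in for a real producer call; it must raise on failure.
        self._send = send
        self._spool_path = spool_path

    def publish(self, message: bytes) -> bool:
        """Return True if sent directly, False if the message was spooled instead."""
        try:
            self._send(message)
            return True
        except Exception:
            self._spool(message)
            return False

    def _spool(self, message: bytes) -> None:
        # One hex-encoded message per line, so arbitrary bytes stay newline-safe.
        with open(self._spool_path, "a", encoding="ascii") as f:
            f.write(message.hex() + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self) -> int:
        """Resend spooled messages in order; keep whatever still fails. Returns count resent."""
        if not os.path.exists(self._spool_path):
            return 0
        with open(self._spool_path, "r", encoding="ascii") as f:
            pending = [line.strip() for line in f if line.strip()]
        sent = 0
        try:
            for line in pending:
                self._send(bytes.fromhex(line))
                sent += 1
        except Exception:
            pass  # broker still unavailable; stop and keep the rest
        # Rewrite the spool with the unsent remainder, then swap it into place.
        tmp_path = self._spool_path + ".tmp"
        with open(tmp_path, "w", encoding="ascii") as f:
            for line in pending[sent:]:
                f.write(line + "\n")
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self._spool_path)
        return sent
```

A caller would use `publish()` on the normal path and run `replay()` periodically, or once the cluster is reachable again. If the process dies between a successful resend and the spool rewrite, some messages are sent twice, so consumers should tolerate duplicates.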
Otis Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Thu, Nov 28, 2013 at 9:37 AM, Demian Berjman <[EMAIL PROTECTED]> wrote:
In that case, if one is that concerned, why not run a single Kafka broker on the same machine, and connect to it over localhost? And disable ZK mode too, perhaps.
I may be missing something, but I have never fully understood why people try so hard to build a stream-to-disk backup approach when they might be able to couple tightly to Kafka, which, well, just streams to disk.
On Nov 28, 2013, at 3:58 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
Sure, and the disk could go bad, the machine itself could fail.
My point is that, in my experience, Kafka 0.7.2 has been very reliable. The only time I have seen it go down is when the disk underneath fills up. So if one is going to write all the code to stream to disk *efficiently* from in-process, one should weigh that cost against connecting to a process over localhost, which has been shown to do a very good job of just that.
It's fair to ask: what if the broker process goes down? But it's equally fair to ask: what if there is a bug in the stream-to-disk code you write? What if your process goes down?
I am not saying I am right. Just that engineering is about trade-offs, and a Kafka instance running right there on the same machine might provide the reliability required. But it might not. As always, YMMV.
Right, broker on localhost and localhost connection don't help if the broker is actually down. It's not only about network reachability and such. We write to FS (yes, file system) as "well, what is the simplest thing that we can do and where we are least likely to hit some other issues? Write to FS.". Yes, much like Kafka brokers themselves, but this helps when Kafka is down for some reason.
Otis Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/ On Thu, Nov 28, 2013 at 8:04 PM, Philip O'Toole <[EMAIL PROTECTED]> wrote: