graham sanderson 2012-07-20, 23:49
Thanks Neha. What I really meant was: yes, I may lose some messages in the meantime, but once everything gets back to normal, should I expect new messages to be delivered (unless my code throws an exception and kills a worker thread, which it didn't)?
On Jul 20, 2012, at 6:00 PM, Neha Narkhede wrote:
> It really depends on what sort of network outage. The producer, whether zk
> or not, can be configured to retry a couple of times. If it runs out of
> retries during this outage, it will drop the messages and they will be
> On Sat, Jul 14, 2012 at 11:32 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
>> The pipeline is supposed to recover from the network outage. There could be
>> bugs, especially in the ZK-based producer since it's relatively new.
>> On Thu, Jul 12, 2012 at 7:08 PM, graham sanderson <[EMAIL PROTECTED]> wrote:
>>> Hi, so I happened to be going to demo a prototype built with kafka in a
>>> borrowed large room which I discovered had insufficient/flaky wireless. I was
>>> using zookeeper config, and getting lots of timeouts etc. Since this was
>>> the first time I had used kafka and I hadn't done any off-path testing,
>>> my first course of action was to find a hard wire, which I did and all the
>>> timeouts disappeared. The demo was great. Note that even with the flaky
>>> wireless, messages generally still seemed to be getting delivered, but not
>>> always as far as I could tell (or perhaps with high latency; I was more
>>> focused on having a working demo than debugging).
>>> I'm using 0.7 atm, though I'm not sure if that matters.
>>> My question is, given a simple scenario using kafka/zookeeper
>>> (prior to all the exciting fault tolerance work going on right now):
>>> 1) Let's say I have a zookeeper server, a kafka server, a producer, and a
>>> consumer running on a perfect network, and I successfully send a message
>>> from producer to consumer.
>>> 2) All JVMs stay up, however I lose network connectivity between some or
>>> all of them for some time
>>> 3) The network becomes perfect again.
>>> 4) I wait for some time for everyone to reconnect/renegotiate to the best
>>> of their ability.
>>> Following that, should I expect a new message from the producer to reach
>>> the consumer, or can the system get into a broken state?… I swear I saw
>>> such a message not delivered, but I can't say for sure… I can certainly
>>> investigate further by trying to reproduce again and wading thru the many
>>> logged errors, but if someone already knows the answer that'd be awesome!
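
For readers following the thread, here is a rough sketch of the behaviour discussed above: a send that retries a bounded number of times, drops the message once retries run out, and delivers new messages again after the link recovers. It does not use the Kafka producer API. `Sender`, `FlakyLink`, `sendWithRetry` and the retry count are all hypothetical names and values invented for illustration, and the "network" is just a counter of failing attempts.

```java
import java.util.ArrayList;
import java.util.List;

public class RetrySketch {
    /** Hypothetical transport; stands in for whatever actually talks to the broker. */
    interface Sender {
        void send(String message) throws Exception;
    }

    /** Simulated link that fails for the first `failures` send attempts, then recovers. */
    static class FlakyLink implements Sender {
        private int failuresLeft;
        final List<String> delivered = new ArrayList<>();

        FlakyLink(int failures) {
            this.failuresLeft = failures;
        }

        @Override
        public void send(String message) throws Exception {
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new Exception("simulated network timeout");
            }
            delivered.add(message);
        }
    }

    /** Makes up to 1 + maxRetries attempts; returns false if the message was dropped. */
    static boolean sendWithRetry(Sender sender, String message, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                sender.send(message);
                return true;
            } catch (Exception e) {
                // Ignore and try again; real code would log and back off here.
            }
        }
        return false; // retries exhausted: the message is dropped
    }

    public static void main(String[] args) {
        FlakyLink link = new FlakyLink(5); // the "outage" lasts for 5 attempts
        boolean duringOutage = sendWithRetry(link, "during-outage", 2);
        boolean afterRecovery = sendWithRetry(link, "after-recovery", 2);
        System.out.println(duringOutage + " " + afterRecovery + " " + link.delivered);
    }
}
```

In this toy model the message sent during the outage is lost, while the later one gets through once the simulated link recovers. Whether a real 0.7 producer and consumer always recover this cleanly is exactly the open question in the thread, so treat this only as an illustration of the retry-then-drop idea.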