Re: Zookeeper

Would you mind explaining how you use Kafka? I'm after a general
overview of the messages/events you are capturing and how you go about
processing them. We will also be using kafka-rb, so I'm particularly
interested in how others are using it.

- M

On 11/5/11 11:49 PM, Tim Lossen wrote:
> we are using kafka entirely without zookeeper, and it is working
> fine so far: single kafka broker, ruby consumers without coordination.
> tim
> On 2011-11-05, at 22:03 , Mark wrote:
>> OK, so ZooKeeper is still required when using Kafka no matter what. One just has the option to load-balance producer => broker connections via either ZooKeeper or a load balancer.
>> Is that correct? If so, I think I finally got it :)
>> On 11/5/11 1:29 PM, Jay Kreps wrote:
>>> It is also worth mentioning that this is just for producers; consumers
>>> always use zookeeper for load balancing and co-ordination. Logically this
>>> makes sense--partitioning production is trivial if you don't care about
>>> semantics of key=>partition assignment, but partitioning consumption is
>>> more complex because you need to divide up the partitions amongst the set
>>> of all consumers exactly.
>>> -jay
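The point above about dividing partitions amongst consumers "exactly" can be sketched in a few lines. This is only an illustration under stated assumptions, not Kafka's actual rebalancing logic: the function name `assign_partitions` and the round-robin scheme are hypothetical, and the ZooKeeper coordination that tells each consumer who else is alive is left out entirely.

```python
def assign_partitions(partitions, consumers):
    """Give every partition to exactly one consumer, as evenly as possible.

    Hypothetical helper for illustration; not part of Kafka or kafka-rb.
    """
    if not consumers:
        raise ValueError("need at least one consumer")
    # Sorting means every consumer computes the same answer independently,
    # provided they all agree on the membership list.
    ordered = sorted(consumers)
    assignment = {c: [] for c in ordered}
    for i, partition in enumerate(sorted(partitions)):
        owner = ordered[i % len(ordered)]
        assignment[owner].append(partition)
    return assignment
```

The hard part in practice is agreeing on the set of live consumers, which is what the coordination service is for; once that set is known, the split itself is simple.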
>>> On Sat, Nov 5, 2011 at 1:19 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>>>> The motivation here is that literally every production process at
>>>> LinkedIn sends messages to Kafka as part of either user tracking or
>>>> operational monitoring or both. We are wary of adding that many zk
>>>> connections and watches, so we run this first tier through a simple L2 load
>>>> balancer that just randomly balances connections over brokers. The good
>>>> part about this is that we can do zookeeper upgrades without redeploying
>>>> all the production apps to upgrade their zk jar.
>>>> As Neha says, the zk producer is used for key-based partitioning by the
>>>> smaller number of producers who need that.
>>>> -Jay
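The two producer styles described above differ only in how a destination is chosen. A minimal sketch, assuming a hash-modulo scheme for the key-based case: the function names and the use of CRC32 are assumptions made for illustration, not Kafka's partitioner, and in the LinkedIn setup the random choice is made by the load balancer rather than by client code.

```python
import random
import zlib


def pick_broker_at_random(brokers, rng=random):
    """Random balancing: any broker will do, so no coordination is needed."""
    return rng.choice(brokers)


def partition_for_key(key, num_partitions):
    """Key-based partitioning: the same key always maps to the same partition.

    CRC32 is used here only because it is stable across processes;
    it is an assumption, not what Kafka itself uses.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_partitions
```

The key-based variant needs every producer to agree on the number of partitions, which is why those producers consult ZooKeeper while the random ones do not have to.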
>>>> On Sat, Nov 5, 2011 at 11:56 AM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>>>>> Mark,
>>>>> Most publishers at LinkedIn use a hardware load balancer approach.
>>>>> These are configured to do a TCP healthcheck that monitors if the
>>>>> kafka port on a broker is working. If it is, then requests are
>>>>> forwarded to the broker. Some publishers though are using the software
>>>>> load balancer based on zookeeper. Those applications want to do some
>>>>> key based partitioning of data.
>>>>> Thanks,
>>>>> Neha
>>>>> On Sat, Nov 5, 2011 at 11:49 AM, Mark <[EMAIL PROTECTED]> wrote:
>>>>>> Sorry, but I'm a bit confused now. So at LinkedIn, do you use a load balancer
>>>>>> instead of ZooKeeper, or do you use it in conjunction with ZooKeeper?
>>>>>> Thanks
>>>>>> On 11/4/11 7:09 PM, Jun Rao wrote:
>>>>>>> broker.list is used in the producer property file. One caveat is that the
>>>>>>> broker.list approach doesn't do a health check, which means that if a broker
>>>>>>> goes down, the client could still try to send messages to it. At LinkedIn,
>>>>>>> we rely on a load balancer to do the health check for us. The zk-based
>>>>>>> producer, on the other hand, does do a health check.
>>>>>>> You can find more details about our ZK design on the design page of the
>>>>>>> website or in the paper at
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
>>>>>>> Jun
>>>>>>> On Fri, Nov 4, 2011 at 6:52 PM, Mark <[EMAIL PROTECTED]> wrote:
>>>>>>>> I just noticed that there is an option to not use ZooKeeper; instead,
>>>>>>>> one can use a static list of brokers (#9 on
>>>>>>>> http://incubator.apache.org/kafka/quickstart.html).
>>>>>>>> Do I put this list in server.properties?
>>>>>>>> It doesn't seem like you save much either way as you have to either
>>>>>>>>   a) list out all the nodes in the zookeeper quorum in