Kafka >> mail # user >> Zookeeper


Re: Zookeeper
Tim,

Would you mind explaining how you use Kafka? Basically the general
overview of the messages/events you are capturing and how you go about
processing them. We will also be using kafka-rb so I'm particularly
interested in how others are using it.

- M

On 11/5/11 11:49 PM, Tim Lossen wrote:
> we are using kafka entirely without zookeeper, and it is working
> fine so far: single kafka broker, ruby consumers without coordination.
>
> tim
>
>
> On 2011-11-05, at 22:03 , Mark wrote:
>
>> Ok, so no matter what, ZooKeeper is still required when using Kafka. One just has the option to load-balance producer => broker connections either via ZooKeeper or via a load balancer.
>>
>> Is that correct? If so, I think I finally got it :)
>>
>> On 11/5/11 1:29 PM, Jay Kreps wrote:
>>> It is also worth mentioning that this is just for producers; consumers
>>> always use zookeeper for load balancing and co-ordination. Logically this
>>> makes sense--partitioning production is trivial if you don't care about
>>> semantics of key=>partition assignment, but partitioning consumption is
>>> more complex because you need to divide up the partitions amongst the set
>>> of all consumers exactly.
>>>
>>> -jay
>>>
>>> On Sat, Nov 5, 2011 at 1:19 PM, Jay Kreps <[EMAIL PROTECTED]> wrote:
>>>
>>>> The motivation here is that literally every production process at
>>>> LinkedIn sends messages to Kafka as part of either user tracking or
>>>> operational monitoring or both. We are wary of adding that many zk
>>>> connections and watches, so we run this first tier through a simple L2 load
>>>> balancer that just randomly balances connections over brokers. The good
>>>> part about this is that we can do zookeeper upgrades without redeploying
>>>> all the production apps to upgrade their zk jar.
>>>>
>>>> As Neha says, the zk producer is used for key-based partitioning by the
>>>> smaller number of producers who need that.
>>>>
>>>> -Jay
>>>>
>>>>
>>>> On Sat, Nov 5, 2011 at 11:56 AM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Mark,
>>>>>
>>>>> Most publishers at LinkedIn use a hardware load balancer approach.
>>>>> These are configured to do a TCP healthcheck that monitors if the
>>>>> kafka port on a broker is working. If it is, then requests are
>>>>> forwarded to the broker. Some publishers though are using the software
>>>>> load balancer based on zookeeper. Those applications want to do some
>>>>> key based partitioning of data.
>>>>>
>>>>> Thanks,
>>>>> Neha
>>>>>
>>>>> On Sat, Nov 5, 2011 at 11:49 AM, Mark <[EMAIL PROTECTED]> wrote:
>>>>>> Sorry but I'm a bit confused now. So at LinkedIn you use a loadbalancer
>>>>>> instead of ZooKeeper or do you use it in conjunction with ZooKeeper?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On 11/4/11 7:09 PM, Jun Rao wrote:
>>>>>>> broker.list is used in the producer property file. One caveat is that
>>>>>>> the broker.list approach doesn't do a health check, which means that if
>>>>>>> a broker goes down, the client could still try to send messages to it.
>>>>>>> At LinkedIn, we rely on a load balancer to do the health check for us.
>>>>>>> The zk-based producer, on the other hand, does do a health check.
>>>>>>>
>>>>>>> You can find out more details about our ZK design on the design page of
>>>>>>> the website or in the paper at
>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
>>>>>>>
>>>>>>> Jun
>>>>>>>
>>>>>>> On Fri, Nov 4, 2011 at 6:52 PM, Mark <[EMAIL PROTECTED]> wrote:
>>>>>>>> I just noticed that there is an option to not use Zookeeper and
>>>>>>>> instead one can use a static list of brokers (#9 on
>>>>>>>> http://incubator.apache.org/kafka/quickstart.html).
>>>>>>>> Do I put this list in server.properties?
>>>>>>>>
>>>>>>>> It doesn't seem like you save much either way as you have to either
>>>>>>>>   a) list out all the nodes in the zookeeper quorum in
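
The quoted thread contrasts a static broker list, which does no health check, with a load balancer that does a TCP health check on the Kafka port and spreads connections randomly over brokers. A minimal sketch of that idea in Python follows. It is not Kafka client code and not LinkedIn's setup: the function names are hypothetical, and brokers are assumed to be plain (host, port) pairs supplied by the caller.

```python
import random
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def pick_broker(brokers, check=port_is_open, rng=random):
    """Pick a random broker among those that pass the health check.

    brokers: iterable of (host, port) tuples (an assumed format, not
             Kafka's own broker.list syntax).
    check:   callable(host, port) -> bool; defaults to a TCP connect test.
    Returns a (host, port) tuple, or None if no broker is reachable.
    """
    healthy = [b for b in brokers if check(*b)]
    return rng.choice(healthy) if healthy else None
```

A caller would pass its own broker addresses, for example `pick_broker([("kafka1.example.com", 9092), ("kafka2.example.com", 9092)])`, where both hostnames and the port are placeholders. Passing a different `check` function makes the selection testable without touching the network.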