Kafka >> mail # dev >> Re: Review Request 13908: Initial patch KAFKA-1012


Tejas Patil 2013-08-30, 23:26
Re: Review Request 13908: Initial patch KAFKA-1012

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13908/#review25904
-----------------------------------------------------------

core/src/main/scala/kafka/common/ErrorMapping.scala
<https://reviews.apache.org/r/13908/#comment50538>

    Could this error code be renamed to something like OffsetLoadingNotCompleteCode? Arguably that name conveys the meaning of the error more clearly.

core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala
<https://reviews.apache.org/r/13908/#comment50544>

    It would be good to be specific about which channel the consumer failed to establish. In this case, let's say "Unable to establish a channel for fetching offsets with any of the live brokers in %s".format(brokers.mkString(","))

core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala
<https://reviews.apache.org/r/13908/#comment50545>

    Is it a good idea for commitOffsets() to swallow every error that it encounters? commitOffsets() is a public API, and users call it to commit offsets manually, on demand. These users do not enable auto commit; they use commitOffsets() to checkpoint offsets as often as the application logic dictates. For that use case, if commitOffsets() has not actually committed the offsets, the caller must be told so that it can retry as required. Thoughts?
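
    To make the suggestion concrete, here is a minimal sketch of a commit path that surfaces failures to the caller instead of swallowing them. The names OffsetCommitFailedException and commitOrThrow are hypothetical and do not come from the patch; the commit function is passed in so the sketch stays independent of ZookeeperConsumerConnector.

```scala
import scala.util.{Failure, Success, Try}

object CommitWithRetry {
  // Hypothetical exception type, introduced only for this illustration.
  final class OffsetCommitFailedException(msg: String, cause: Throwable)
    extends RuntimeException(msg, cause)

  // Runs `commit` up to `maxAttempts` times. If every attempt fails, the last
  // failure is wrapped and rethrown so the caller knows the offsets were not committed.
  def commitOrThrow(maxAttempts: Int)(commit: () => Unit): Unit = {
    require(maxAttempts >= 1, "maxAttempts must be at least 1")
    var attempt = 0
    var committed = false
    while (!committed) {
      attempt += 1
      Try(commit()) match {
        case Success(_) => committed = true
        case Failure(e) if attempt >= maxAttempts =>
          throw new OffsetCommitFailedException(
            "Offset commit failed after %d attempts".format(attempt), e)
        case Failure(_) => // earlier failure: fall through and retry
      }
    }
  }
}
```

    A caller that checkpoints manually could then catch the exception and decide whether to retry, back off, or stop processing.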

core/src/main/scala/kafka/consumer/ZookeeperConsumerConnector.scala
<https://reviews.apache.org/r/13908/#comment50550>

    This error message could be clearer as well. Something along the lines of "as offset bootstrap is still in progress on some brokers. This means leadership changed recently for the offsets topic".

core/src/main/scala/kafka/server/KafkaServer.scala
<https://reviews.apache.org/r/13908/#comment50556>

    Curious - why do we need to use the singleton pattern here? Shouldn't only one thread invoke KafkaServer.startup?

core/src/main/scala/kafka/server/OffsetManager.scala
<https://reviews.apache.org/r/13908/#comment50557>

    This file has turned into a big blob of code. It would help to move the OffsetManager trait, DefaultOffsetManager and ZookeeperOffsetManager into separate files.

core/src/main/scala/kafka/server/OffsetManager.scala
<https://reviews.apache.org/r/13908/#comment50552>

    I think it is best not to pass any parameters to the startup() API, since it is difficult to come up with a set of parameters that works for every possible offset manager. What might work better is a generic init API that takes a Properties object and initializes the context the offset manager needs. I'm not sure whether startup() is still useful once we add init(Properties).
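
    A rough sketch of what an init(Properties) based interface could look like follows. Only OffsetManager and init(Properties) come from this review; DatabaseOffsetManager, storeLocation and the property key example.offsets.jdbc.url are made up for illustration.

```scala
import java.util.Properties

// Assumed shape of the pluggable interface; the real trait has more methods.
trait OffsetManager {
  // Each implementation reads only the settings it understands.
  def init(props: Properties): Unit
}

// Hypothetical database-backed implementation, used only to show that
// init(Properties) needs no implementation-specific parameters in the trait.
class DatabaseOffsetManager extends OffsetManager {
  private var jdbcUrl: String = ""

  override def init(props: Properties): Unit = {
    // "example.offsets.jdbc.url" is an invented key, not a Kafka config.
    jdbcUrl = Option(props.getProperty("example.offsets.jdbc.url")).getOrElse(
      throw new IllegalArgumentException("example.offsets.jdbc.url is required"))
  }

  def storeLocation: String = jdbcUrl
}
```

    With this shape, a log-backed or ZooKeeper-backed manager would read its own keys from the same Properties object, and the trait signature would not have to change per implementation.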

core/src/main/scala/kafka/server/OffsetManager.scala
<https://reviews.apache.org/r/13908/#comment50555>

    "Load the offsets from the logs" is not generic enough. What if the offsets are stored in a database or in custom flat files?

core/src/main/scala/kafka/server/OffsetManager.scala
<https://reviews.apache.org/r/13908/#comment50554>

    Agree with Sriram that this could be named differently. It would also help to describe the purpose of each of these APIs clearly. For example, if I want to store offsets in a database, how do I know why triggerLoadOffsets is required? Is it used to bootstrap some sort of offsets cache on startup?

    Also, try to describe when each of these APIs is invoked on the Kafka server side. That will make it much easier for a user to implement a specific offset manager.

core/src/main/scala/kafka/server/OffsetManager.scala
<https://reviews.apache.org/r/13908/#comment50577>

    There seems to be a race condition that can overwrite a newer offset with a stale one. It can occur when a broker becomes the leader for some partition of the offsets topic. At that moment, partition.makeLeader() exposes the broker as the new leader, and from then on it can accept offset commit requests. An offset commit request can therefore arrive at the same time that triggerLoadOffsets() is being invoked for the same offsets partition. putOffset() goes through and updates the offsets table with the new offset, but it does not touch commitsWhileLoading, because loading does not yet contain the key. Then the first statement in triggerLoadOffsets() executes and adds the offsets partition to loading. The load goes on to update the offsets table with the old offset, since putOffset() never updated commitsWhileLoading.
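
    The interleaving above can be replayed deterministically with a simplified, single-threaded model. The names offsets, loading, commitsWhileLoading and putOffset follow this review; the method bodies, the split of triggerLoadOffsets() into beginLoad and finishLoad, and the sample key are assumptions made for illustration, not the code in the patch.

```scala
import scala.collection.mutable

// Simplified model of the state discussed in the review (assumed structure).
class OffsetCacheModel {
  val offsets = mutable.Map.empty[String, Long]        // key -> cached offset
  val loading = mutable.Set.empty[Int]                 // offsets partitions being loaded
  val commitsWhileLoading = mutable.Set.empty[String]  // keys committed during a load

  def putOffset(partition: Int, key: String, offset: Long): Unit = {
    offsets(key) = offset
    // Only recorded if the load has already marked the partition as loading.
    if (loading.contains(partition)) commitsWhileLoading += key
  }

  // Stands in for the first statement of triggerLoadOffsets().
  def beginLoad(partition: Int): Unit = loading += partition

  // Stands in for the rest of the load: replay entries read from storage,
  // skipping keys that were committed while the load was in progress.
  def finishLoad(partition: Int, persisted: Seq[(String, Long)]): Unit = {
    for ((key, offset) <- persisted if !commitsWhileLoading.contains(key))
      offsets(key) = offset
    loading -= partition
  }
}

object RaceDemo {
  def run(): Long = {
    val model = new OffsetCacheModel
    val key = "group1-topicA-0" // hypothetical key
    model.putOffset(0, key, 42L)         // commit arrives right after makeLeader()
    model.beginLoad(0)                   // load starts only now
    model.finishLoad(0, Seq(key -> 7L))  // stale entry read from storage
    model.offsets(key)
  }
}
```

    In this model RaceDemo.run() returns 7, the stale offset, even though 42 was committed first. Marking the partition as loading before the broker is exposed as leader, or making that check and the table update atomic under one lock, would close the window in the model; whether either fits the actual patch is for the author to judge.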
- Neha Narkhede
On Aug. 30, 2013, 9:19 p.m., Tejas Patil wrote:
 