Re: Random Partitioning Issue
Let me ask another question which I think is more objective. Let's say 100
random, smart infrastructure specialists try Kafka; of these 100, how many
do you believe will
1. Say that this behavior is what they expected to happen?
2. Be happy with this behavior?
I am not being facetious; I am genuinely looking for a numerical estimate. I
am trying to figure out if nobody thought about this or if my estimate is
just really different. For what it is worth, my estimate is 0 and 5,
respectively.

This would be fine except that we changed it from the good behavior to the
bad behavior to fix an issue that probably only we have.

-Jay
On Sun, Sep 15, 2013 at 8:37 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:

> I just took a look at this change. I agree with Joe; not to put too fine a
> point on it, but this is a confusing hack.
>
> Jun, I don't think wanting to minimize the number of TCP connections is
> going to be a very common need for people with fewer than 10k producers. I
> also don't think people are going to get very good load balancing out of
> this because most people don't have a ton of producers. I think instead we
> will spend the next year explaining this behavior, which 99% of people will
> think is a bug (because it is crazy, non-intuitive, and breaks their usage).
>
> Why was this done by adding special default behavior in the null-key case
> instead of as a partitioner? The argument that the partitioner interface
> doesn't have sufficient information to choose a partition is not a good
> argument for hacking in changes to the default; it is an argument for
> *improving* the partitioner interface.
>
> The whole point of a partitioner interface is to make it possible to plug
> in non-standard behavior like this, right?
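>
> For illustration, here is a rough sketch of what plugging in such behavior
> could look like, assuming the 0.8-style contract of a single
> partition(key, numPartitions) method constructed from the producer's
> properties (the class and wiring below are just an example, not an actual
> implementation):
>
>     import java.util.Random;
>
>     import kafka.producer.Partitioner;
>     import kafka.utils.VerifiableProperties;
>
>     // Example of a purely random partitioner supplied by the user rather
>     // than baked into the default null-key handling.
>     public class PureRandomPartitioner implements Partitioner {
>         private final Random random = new Random();
>
>         // Constructor shape assumed to match how 0.8 instantiates
>         // partitioners from the producer configuration.
>         public PureRandomPartitioner(VerifiableProperties props) { }
>
>         // Spread traffic uniformly across all partitions of the topic.
>         public int partition(Object key, int numPartitions) {
>             return random.nextInt(numPartitions);
>         }
>     }
>
> A producer would opt in with something like
> props.put("partitioner.class", "PureRandomPartitioner"), so the default
> stays simple and anyone who wants different behavior chooses it explicitly.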
>
> -Jay
>
>
> On Sat, Sep 14, 2013 at 8:15 PM, Jun Rao <[EMAIL PROTECTED]> wrote:
>
>> Joe,
>>
>> Thanks for bringing this up. I want to clarify this a bit.
>>
>> 1. Currently, the producer-side logic is that if the partitioning key is
>> not provided (i.e., it is null), the partitioner won't be called. We did
>> that because we want to select a random and "available" partition to send
>> messages to, so that if some partitions are temporarily unavailable
>> (because of broker failures), messages can still be sent to other
>> partitions. Doing this in the partitioner is difficult since the
>> partitioner doesn't know which partitions are currently available (the
>> DefaultEventHandler does).
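>>
>> To make that concrete, here is a rough sketch of the idea (not the actual
>> DefaultEventHandler code; the helper name and types are made up): pick a
>> random partition among only the partitions that currently have a leader.
>>
>>     import java.util.List;
>>     import java.util.Random;
>>
>>     // Illustrative only: the producer's topic metadata already says which
>>     // partitions have a live leader, so the null-key path can choose among
>>     // just those instead of all partitions.
>>     public class AvailablePartitionChooser {
>>         private final Random random = new Random();
>>
>>         public int choose(List<Integer> availablePartitions, int numPartitions) {
>>             if (availablePartitions.isEmpty()) {
>>                 // No leader known for any partition; fall back to a blind pick.
>>                 return random.nextInt(numPartitions);
>>             }
>>             return availablePartitions.get(random.nextInt(availablePartitions.size()));
>>         }
>>     }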
>>
>> 2. As Joel said, the common use case in production is that there are many
>> more producers than #partitions in a topic. In this case, sticking to a
>> partition for a few minutes is not going to cause too much imbalance in
>> the
>> partitions and has the benefit of reducing the # of socket connections. My
>> feeling is that this will benefit most production users. In fact, if one
>> uses a hardware load balancer for producing data in 0.7, it behaves in
>> exactly the same way (a producer will stick to a broker until the
>> reconnect
>> interval is reached).
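>>
>> As a sketch of what "sticking to a partition for a few minutes" means in
>> practice (again illustrative, not the real code; the class name and refresh
>> interval are placeholders), the choice is simply cached and only re-made
>> after an interval:
>>
>>     import java.util.List;
>>     import java.util.Random;
>>
>>     // Illustrative sticky-random selection: reuse one randomly chosen
>>     // available partition and re-pick only after refreshIntervalMs, so a
>>     // producer talks to few brokers between refreshes.
>>     public class StickyRandomChooser {
>>         private final Random random = new Random();
>>         private final long refreshIntervalMs;
>>         private int cachedPartition = -1;
>>         private long lastChoiceMs = 0L;
>>
>>         public StickyRandomChooser(long refreshIntervalMs) {
>>             this.refreshIntervalMs = refreshIntervalMs;
>>         }
>>
>>         public synchronized int choose(List<Integer> availablePartitions) {
>>             long now = System.currentTimeMillis();
>>             if (cachedPartition < 0 || now - lastChoiceMs > refreshIntervalMs) {
>>                 cachedPartition = availablePartitions.get(
>>                     random.nextInt(availablePartitions.size()));
>>                 lastChoiceMs = now;
>>             }
>>             return cachedPartition;
>>         }
>>     }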
>>
>> 3. It is true that if one is testing a topic with more than one partition
>> (which is not the default), this behavior can be a bit weird. However, I
>> think it can be mitigated by running multiple test producer instances.
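>>
>> For example (a rough sketch against the 0.8 Java producer API; the broker
>> address, topic name, and message counts are placeholders), each Producer
>> instance below picks its own sticky partition for null-keyed messages, so a
>> handful of instances together exercise more than one partition:
>>
>>     import java.util.Properties;
>>
>>     import kafka.javaapi.producer.Producer;
>>     import kafka.producer.KeyedMessage;
>>     import kafka.producer.ProducerConfig;
>>
>>     public class MultiInstanceTest {
>>         public static void main(String[] args) {
>>             Properties props = new Properties();
>>             props.put("metadata.broker.list", "localhost:9092");
>>             props.put("serializer.class", "kafka.serializer.StringEncoder");
>>             ProducerConfig config = new ProducerConfig(props);
>>
>>             // Several independent producer instances; each sends messages
>>             // without a key and so sticks to its own chosen partition.
>>             for (int i = 0; i < 3; i++) {
>>                 Producer<String, String> producer =
>>                     new Producer<String, String>(config);
>>                 for (int m = 0; m < 10; m++) {
>>                     producer.send(new KeyedMessage<String, String>(
>>                         "test-topic", "instance-" + i + "-msg-" + m));
>>                 }
>>                 producer.close();
>>             }
>>         }
>>     }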
>>
>> 4. Someone reported on the mailing list that all data shows up in only one
>> partition after a few weeks. This is clearly not the expected behavior. We
>> can take a closer look to see if this is a real issue.
>>
>> Do you think these address your concerns?
>>
>> Thanks,
>>
>> Jun
>>
>>
>>
>> On Sat, Sep 14, 2013 at 11:18 AM, Joe Stein <[EMAIL PROTECTED]> wrote:
>>
>> > How about creating a new class called RandomRefreshPartioner, copying the
>> > DefaultPartitioner code to it, and then reverting the DefaultPartitioner
>> > code. I appreciate this is a one-time burden for folks using the existing
>> > 0.8-beta1 who bumped into KAFKA-1017 in production and have to switch to
>> > the RandomRefreshPartioner, and when folks deploy to production will have to