Kafka, mail # dev - Re: Random Partitioning Issue - 2013-09-17, 17:19
Re: Random Partitioning Issue
I agree that minimizing the number of producer connections (while
being a good thing) is really required in very large production
deployments, and the net effect of the existing change is
counter-intuitive to users who expect an immediate even distribution
across _all_ partitions of the topic.

However, I don't think it is a hack because it is almost exactly the
same behavior as 0.7 in one of its modes. The 0.7 producer (which I
think was even more confusing) had three modes:
i) ZK send
ii) Config send(a): static list of broker1:port1,broker2:port2,etc.
iii) Config send(b): static list of a hardwareVIP:VIPport

(i) and (ii) would achieve even distribution. (iii) would effectively
select one broker and distribute to partitions on that broker within
each reconnect interval. (iii) is very similar to what we now do in
0.8. (Although we stick to one partition during each metadata refresh
interval, that can be changed to stick to one broker and distribute
across partitions on that broker).
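
To make the difference concrete, here is a minimal sketch in Python
(not the actual producer code). It compares a per-message random
choice with the "stick to one partition per refresh interval"
behavior. The class and function names, and the interval value used
in the usage note, are assumptions made up for illustration.

```python
import random
from collections import Counter


class StickyChooser:
    """Hypothetical model: pick one partition at random and keep using it
    until the metadata refresh interval has elapsed."""

    def __init__(self, num_partitions, refresh_interval_ms, rng):
        self.num_partitions = num_partitions
        self.refresh_interval_ms = refresh_interval_ms
        self.rng = rng
        self.current = None
        self.last_refresh_ms = None

    def choose(self, now_ms):
        expired = (
            self.last_refresh_ms is None
            or now_ms - self.last_refresh_ms >= self.refresh_interval_ms
        )
        if expired:
            # Simulated metadata refresh: re-pick the partition to stick to.
            self.current = self.rng.randrange(self.num_partitions)
            self.last_refresh_ms = now_ms
        return self.current


def random_choice_counts(num_partitions, num_messages, rng):
    """Per-message random choice: each message may go to a different partition."""
    return Counter(rng.randrange(num_partitions) for _ in range(num_messages))


def sticky_counts(num_partitions, refresh_interval_ms, send_times_ms, rng):
    """Message counts per partition when the chooser sticks between refreshes."""
    chooser = StickyChooser(num_partitions, refresh_interval_ms, rng)
    return Counter(chooser.choose(t) for t in send_times_ms)
```

With an assumed 600000 ms interval, sticky_counts(8, 600_000,
range(0, 600_000, 1000), random.Random(0)) puts all 600 messages on a
single partition, whereas random_choice_counts spreads messages over
the partitions right away. That gap is what users find surprising.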

At the same time, I agree with Joe's suggestion that we should keep
the more intuitive pre-KAFKA-1017 behavior as the default and move the
change in KAFKA-1017 to a more specific partitioner implementation.
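
Roughly, that split could look like the sketch below (Python, purely
illustrative). The "partitioner" config key, the registry and both
class names are hypothetical and do not correspond to actual Kafka
classes or settings.

```python
import random


class RandomPartitioner:
    """Default in this sketch: a fresh random partition for every message."""

    def __init__(self, rng):
        self.rng = rng

    def partition(self, num_partitions):
        return self.rng.randrange(num_partitions)

    def on_metadata_refresh(self):
        pass  # nothing to reset


class StickyPartitioner:
    """Opt-in: reuse one randomly chosen partition until the next refresh."""

    def __init__(self, rng):
        self.rng = rng
        self.current = None

    def partition(self, num_partitions):
        if self.current is None:
            self.current = self.rng.randrange(num_partitions)
        return self.current

    def on_metadata_refresh(self):
        self.current = None  # forces a re-pick on the next send


# Hypothetical registry keyed by a made-up config value.
PARTITIONERS = {"random": RandomPartitioner, "sticky": StickyPartitioner}


def make_partitioner(config, rng=None):
    """Use the intuitive random behavior unless the user opts in to sticky."""
    name = config.get("partitioner", "random")
    return PARTITIONERS[name](rng or random.Random())
```

Users who do nothing get even distribution; only deployments that
care about the connection count would set the option explicitly.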

Joel
On Sun, Sep 15, 2013 at 8:44 AM, Jay Kreps <[EMAIL PROTECTED]> wrote:

 