There is a tradeoff:
Pro: more partitions means more consumer parallelism. The total number of
threads/processes across all consumer machines can't exceed the number of
partitions.
Con: more partitions mean more file descriptors and hence smaller writes to
each file (so more random I/O).
Our setting is fairly arbitrary. The ideal number would be the smallest number
that satisfies your foreseeable need for consumer parallelism.
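
To make the rule of thumb concrete, here is a rough sketch of the arithmetic.
It is not anything from Kafka itself: the function names and the
machines/threads inputs are made up for illustration, and it assumes all
consumers belong to a single consumer group.

```python
def active_consumers(total_threads: int, partitions: int) -> int:
    """Consumer threads that can actually receive data.

    Assumption: within one consumer group a partition is read by at most
    one thread, so threads beyond the partition count sit idle.
    """
    return min(total_threads, partitions)


def suggested_partitions(machines: int, threads_per_machine: int) -> int:
    """Smallest partition count that keeps every planned thread busy.

    Both inputs are hypothetical planning numbers: the consumer machines
    and threads per machine you foresee needing, not what runs today.
    """
    return max(1, machines * threads_per_machine)


if __name__ == "__main__":
    # Example: you expect to grow to 4 consumer machines with 3 threads each.
    needed = suggested_partitions(machines=4, threads_per_machine=3)
    print("suggested partitions:", needed)
    # With only 8 partitions, 12 threads would leave 4 of them idle.
    print("active threads with 8 partitions:", active_consumers(12, 8))
```

Going much above that number buys no extra parallelism and only adds file
descriptors and smaller writes per file.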
On Thu, Aug 15, 2013 at 3:41 PM, Vadim Keylis <[EMAIL PROTECTED]> wrote: