

Re: Kafka Broker Configuration Tuning and Repartitioning topic
I think this may be a terminology issue. By "re-partitioning" I think Neha
means taking data currently on disk and splitting it into a different
number of partitions on different servers. We can't really do this because
the partition function is something computed on the client.
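
To make the point about the client-side partition function concrete, here is a minimal Python sketch. The function name `choose_partition`, the CRC32-modulo scheme, and the toy `append` helper are all assumptions for illustration, not Kafka's actual partitioner or broker code. The idea is only that the broker stores whatever lands in each partition and has no rule of its own for re-splitting it.

```python
import zlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    """Hypothetical client-side partition function: stable hash modulo the count."""
    return zlib.crc32(key) % num_partitions

def append(log: dict, partition: int, payload: bytes) -> None:
    """Toy broker-side view: one append-only list per partition.

    The broker only sees (partition, payload); it never sees how the
    producer picked the partition.
    """
    log.setdefault(partition, []).append(payload)

log = {}
for key in [b"user-1", b"user-2", b"user-3"]:
    # 46 is the partition count mentioned for topic6 in this thread.
    append(log, choose_partition(key, 46), b"payload for " + key)
```

Since the mapping from key to partition exists only in the producer, the server side would have to guess it to redistribute data already on disk.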

A different issue is migrating partitions to different servers; that will
be supported. This is the standard kind of over-partitioning setup you
would expect in many distributed systems (i.e. you create up front a fixed
number of partitions which doesn't change, but you can move them around).
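
A rough Python sketch of that over-partitioning idea follows. The broker names, the count of 8 partitions, and the `move_partition` helper are hypothetical and are not part of any Kafka tool. It only shows that moving a partition changes where it lives, not how many partitions exist.

```python
def move_partition(assignment: dict, partition: int, new_broker: str) -> dict:
    """Return a new assignment with one partition relocated.

    The set of partitions is unchanged; only the owning server differs.
    """
    if partition not in assignment:
        raise KeyError(f"unknown partition {partition}")
    updated = dict(assignment)
    updated[partition] = new_broker
    return updated

# Hypothetical cluster: 8 partitions created up front, spread over 2 brokers.
assignment = {p: f"broker-{p % 2}" for p in range(8)}

# A third broker joins; relocate partitions 6 and 7 to it.
rebalanced = move_partition(assignment, 6, "broker-2")
rebalanced = move_partition(rebalanced, 7, "broker-2")
```

Because the partition count stays fixed, whatever partition function the clients use keeps producing the same answers before and after the move.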

Another issue is changing the total number of partitions for a topic. This
will eventually be supported, though maybe not in 0.8 iiuc. You would do
this if you wanted more parallelism in the topic. Even though we wouldn't
go back and retrofit data into the new partitions, that is probably fine as
data would naturally cycle out as it falls out of the retention period.
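
As a rough illustration of what changing the partition count does: if the client picks a partition as a hash of the key modulo the count (an assumed scheme, not necessarily what Kafka uses), then raising the count changes where new messages for many keys land, while older messages stay where they were written until retention removes them. The new count of 64 below is made up for the example; 46 is the count mentioned for topic6 in the quoted messages.

```python
import zlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    # Assumed client-side scheme: stable hash modulo the partition count.
    return zlib.crc32(key) % num_partitions

OLD_COUNT = 46  # partition count quoted for topic6 in this thread
NEW_COUNT = 64  # hypothetical new count, chosen only for illustration

keys = [f"key-{i}".encode() for i in range(1000)]

# Count the keys whose new messages would go to a different partition
# than the one holding their older messages.
moved = sum(
    1
    for k in keys
    if choose_partition(k, OLD_COUNT) != choose_partition(k, NEW_COUNT)
)
print(f"{moved} of {len(keys)} keys now map to a different partition")
```

For those keys, old and new messages sit in different partitions until the old segments fall out of the retention window.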

-Jay
On Tue, Nov 20, 2012 at 11:17 AM, Muthukumar <[EMAIL PROTECTED]> wrote:

> Hi Neha,
>
> Thanks for the response, and we're currently working to integrate with
> mbeans exposed with collectors and monitor it.
>
> It would be great to know: since repartitioning is not supported, can
> we move the files from one partition to another so they get picked up?
> Will that work?
>
> Noads-8:
> total 9886868
> -rw-r--r--. 1 root root 536871644 Nov 18 20:38 00000000037581033857.kafka
> -rw-r--r--. 1 root root 536871327 Nov 18 22:01 00000000038117905501.kafka
> -rw-r--r--. 1 root root 536871525 Nov 18 23:22 00000000038654776828.kafka
> -rw-r--r--. 1 root root 536871520 Nov 19 00:40 00000000039191648353.kafka
> -rw-r--r--. 1 root root 536871057 Nov 19 01:56 00000000039728519873.kafka
>
> Noads-9:
> total 9891380
> -rw-r--r--. 1 root root 536871893 Nov 18 20:37 00000000037581035640.kafka
> -rw-r--r--. 1 root root 536871274 Nov 18 22:00 00000000038117907533.kafka
> -rw-r--r--. 1 root root 536872062 Nov 18 23:21 00000000038654778807.kafka
> -rw-r--r--. 1 root root 536872190 Nov 19 00:40 00000000039191650869.kafka
>
> If we move one of the files in partition#9 of Noads to partition#8, will
> that work? Thanks.
>
> -Muthu
>
> On Tue, Nov 20, 2012 at 9:51 PM, Neha Narkhede <[EMAIL PROTECTED]>
> wrote:
> > Muthu,
> >
> > a) Not as of now. Please feel free to create the JIRA and specify the
> > details there
> >
> > b) I doubt increasing partitions will help. 500 GB/day/topic suggests
> > the data per partition is only 10 GB/day. Before thinking about
> > increasing the # of partitions, I would try a few things-
> >
> > 1. Inspect the consumer throughput metrics through the mbeans exposed
> > on the Kafka consumers.
> > 2. If individual consumer throughput looks reasonable, then deploy
> > more consumer instances and see if that helps. Since you have 40-50
> > partitions per topic, you can have at least those many consumer
> > instances.
> > 3. If not, then check if the consumers post-process the data consumed
> > from these partitions. If this processing is slow, your consumption
> > rate will reduce.
> >
> > Thanks,
> > Neha
> >
> > On Tue, Nov 20, 2012 at 3:12 AM, Muthukumar <[EMAIL PROTECTED]> wrote:
> >> Hi Jun,
> >>
> >> Thanks for the response.
> >>
> >> a) Is there any plan in the roadmap to address this re-partition or
> >> partition balance with new partitions? Please let me know to have the
> >> JIRA for this.
> >>
> >> b) Do we need to go for more partitions for topic6 (46 to ??) to
> >> reduce the new requests + backlog?
> >>
> >> -Muthu
> >>
> >> On Tue, Nov 20, 2012 at 11:09 AM, Jun Rao <[EMAIL PROTECTED]> wrote:
> >>> The configs look reasonable. Currently, we don't repartition existing
> >>> data. Only new messages will consider the newly added partitions.
> >>>
> >>> Thanks
> >>>
> >>> Jun
> >>>
>
>
>
> --
> Mail: [EMAIL PROTECTED] / [EMAIL PROTECTED] | Phone:
> +91-94436-62936 (Chennai) / +91-96207-89253 (Bangalore)
>