Re: Tuning mirror maker performance
Jun Rao 2013-08-23, 14:58
The bottleneck can be either CPU, network, or disk I/O. You just need to
monitor the load on each. For example, if you monitor the per-thread CPU
load in MM, you can figure out whether there is a single thread that's the
bottleneck. Then you can look at the I/O load on the target broker and see
if I/O is saturated. If not, increasing the batch size in the producer will
The metadata refresh interval only applies to existing topics. The producer
always refreshes metadata for a new topic that it has not seen before.
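A minimal sketch of the per-thread CPU check, assuming the per-thread CPU percentages have already been collected (for example with `top -H`). The thread names, the sample values, and the 90% threshold are hypothetical, chosen only for illustration.

```python
def find_hot_threads(cpu_by_thread, threshold=90.0):
    """Return (name, pct) pairs at or above the threshold, busiest first."""
    hot = [(name, pct) for name, pct in cpu_by_thread.items() if pct >= threshold]
    return sorted(hot, key=lambda item: item[1], reverse=True)


# Hypothetical samples, not measurements from this thread.
samples = {
    "mirrormaker-consumer-0": 35.0,
    "mirrormaker-consumer-1": 41.5,
    "mirrormaker-producer-0": 97.2,
}

for name, pct in find_hot_threads(samples):
    print(f"{name} is using {pct}% of one core")
```

A thread pinned near 100% of one core while the others are mostly idle suggests that thread is the bottleneck.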
On Fri, Aug 23, 2013 at 7:08 AM, Rajasekar Elango <[EMAIL PROTECTED]> wrote:
> Thanks Jun,
> What troubleshooting steps can we take to identify whether the bottleneck is
> in consuming or producing? Would changing anything in the log4j configuration,
> or any JMX MBeans, provide insight into it? Does the metadata refresh interval
> affect picking up new partitions only for existing topics, or does it also
> affect picking up new topics?
> ---------- Forwarded message ----------
> From: Jun Rao <[EMAIL PROTECTED]>
> Date: Fri, Aug 23, 2013 at 12:08 AM
> Subject: Re: Tuning mirror maker performance
> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
> You have to determine whether the bottleneck is in the consumer or the
> producer.
> To improve the performance of the former, you can increase the total # of
> consumer streams. The # of streams is capped by the total # of partitions, so
> you may need to increase the # of partitions.
> To improve the performance of the latter, you can (a) increase the batch
> size in async mode and/or (b) run more instances of producers.
> Metadata refresh interval is configurable. It's mainly for the producer to
> pick up newly available partitions.
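A rough sketch of the stream cap, assuming it applies to the partition count of the topic being consumed and ignoring how partitions are spread across several topics. The function names are illustrative only. The figures are taken from the setup in the original question: 2 mirror makers and 4 partitions per topic.

```python
def effective_streams(num_mirror_makers, num_streams_each, partitions):
    """Streams that can own at least one partition; the rest sit idle."""
    total = num_mirror_makers * num_streams_each
    return min(total, partitions)


def idle_streams(num_mirror_makers, num_streams_each, partitions):
    """Streams left without a partition to consume."""
    total = num_mirror_makers * num_streams_each
    return total - effective_streams(num_mirror_makers, num_streams_each, partitions)


# 2 mirror makers with num.streams=2 already cover all 4 partitions.
print(effective_streams(2, 2, 4), idle_streams(2, 2, 4))
# Doubling num.streams to 4 adds only idle streams.
print(effective_streams(2, 4, 4), idle_streams(2, 4, 4))
```

Under this assumption, raising num.streams past the partition count cannot help; more partitions are needed before more streams do useful work.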
> On Thu, Aug 22, 2013 at 1:44 PM, Rajasekar Elango <[EMAIL PROTECTED]> wrote:
> > I am trying to tune mirrormaker configurations based on this doc and
> > would like to know your recommendations.
> > Our configuration: We are doing inter-datacenter replication with 5 brokers
> > in the source and destination DCs and 2 mirrormakers doing the replication.
> > We have about 4 topics with 4 partitions each.
> > I have been using consumerOffsetChecker to analyze lag after each tuning change.
> > 1. num.streams: We have set num.streams=2 so that the 4 partitions will
> > be shared between the 2 mirrormakers. Increasing num.streams beyond this
> > did not improve performance. Is this correct?
> > 2. num.producers: We initially set num.producers=4 (assuming one
> > producer thread per topic), then bumped it to num.producers=16, but did
> > not see any improvement in performance. Is this correct? How do we
> > determine the optimum value for num.producers?
> > 3. socket.buffersize: We initially had default values for these. Then
> > I changed socket.send.buffer.bytes on the source broker,
> > socket.receive.buffer.bytes and fetch.message.max.bytes in the mirrormaker
> > consumer properties, and socket.receive.buffer.bytes and
> > socket.request.max.bytes on the destination broker, all to
> > 1024*1024*1024 (1073741824). This did improve the performance, but I
> > could not get the lag below 100.
> > Here is what our lag looks like after the above changes:
> > Group            Topic        Pid  Offset     logSize    Lag    Owner
> > mirrormakerProd  FunnelProto  0    554704539  554717088  12549  mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-0
> > mirrormakerProd  FunnelProto  1    547370573  547383136  12563  mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1377192412490-38a53dc9-1
> > mirrormakerProd  FunnelProto  2    553124930  553125742  812    mirrormakerProd_ops-mmrs1-2-asg.ops.sfdc.net-1377193322178-7262ed87-0
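The Lag column in this output is logSize minus Offset. A minimal Python sketch that recomputes it from the three rows above. The tuple layout and function name are illustrative and not part of any Kafka tool.

```python
# (topic, partition id, consumer offset, log size) copied from the output above.
ROWS = [
    ("FunnelProto", 0, 554704539, 554717088),
    ("FunnelProto", 1, 547370573, 547383136),
    ("FunnelProto", 2, 553124930, 553125742),
]


def partition_lag(offset, log_size):
    """Messages written to the partition but not yet consumed."""
    return log_size - offset


lags = {pid: partition_lag(offset, size) for _, pid, offset, size in ROWS}
total_lag = sum(lags.values())

print(lags)       # {0: 12549, 1: 12563, 2: 812}
print(total_lag)  # 25924
```

Tracking the per-partition values over time, rather than a single snapshot, shows whether the mirror maker is catching up or falling further behind.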