Kafka >> mail # user >> Mirrormaker stopped consuming


Re: Mirrormaker stopped consuming
16 GB is a very large heap. GC tuning becomes trickier as the size of the
heap increases. Are you sure you need that much memory to operate the
mirror maker? For us, the following GC settings have worked well -
https://cwiki.apache.org/confluence/display/KAFKA/Operations#Operations-Java

Thanks,
Neha
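
Before changing heap or GC settings, it can help to measure how often the session timeouts actually occur. The following is a minimal sketch, not something posted in this thread: it assumes each log entry sits on a single line in the same format as the "Client session timed out" line quoted below, and the function name `summarize_timeouts` is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the ZooKeeper client log line quoted in this thread.
# Assumption: each log entry occupies one line in the real log file.
TIMEOUT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2},\d{3} .*"
    r"Client session timed out, have not heard from server in (\d+)ms"
)


def summarize_timeouts(lines):
    """Return (timeouts per hour, longest reported silence in ms)."""
    per_hour = Counter()
    longest_ms = 0
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            per_hour[match.group(1)] += 1  # key looks like "2013-09-01 05"
            longest_ms = max(longest_ms, int(match.group(2)))
    return per_hour, longest_ms
```

Passing an open log file to `summarize_timeouts` yields hourly counts that can be lined up against the JVM's GC log to see whether the timeouts coincide with long pauses.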
On Tue, Sep 3, 2013 at 10:40 AM, Rajasekar Elango <[EMAIL PROTECTED]> wrote:

> Thanks Neha,
>
> I did not take a thread dump before restarting; I will get one when it happens
> again. We are using 16 GB of JVM heap. Do you have a recommendation on
> JVM GC options?
>
> Thanks,
> Raja.
>
>
> On Tue, Sep 3, 2013 at 12:26 PM, Neha Narkhede <[EMAIL PROTECTED]> wrote:
>
> > 2013-09-01 05:59:27,792 [main-EventThread] INFO  (org.I0Itec.zkclient.ZkClient)  - zookeeper state changed (Disconnected)
> > 2013-09-01 05:59:27,692 [main-SendThread(mandm-zookeeper-asg.data.sfdc.net:2181)] INFO  (org.apache.zookeeper.ClientCnxn)  - Client session timed out, have not heard from server in 4002ms for sessionid 0x140c603da5b0032, closing socket connection and attempting reconnect
> >
> > This indicates that your mirror maker and/or your zookeeper cluster is
> > GCing for long periods of time. I have observed that if "client session
> > timed out" happens too many times, the client tends to lose zookeeper
> > watches. This is a potential bug in zookeeper. If this happens, your
> > mirror maker instance might not rebalance correctly and will start
> > losing data.
> >
> > You mentioned consumption/production stopped on your mirror maker, could
> > you please take a thread dump and point us to it? Meanwhile, you might
> > want to fix the GC pauses.
> >
> > Thanks,
> > Neha
> >
> >
> > On Tue, Sep 3, 2013 at 8:59 AM, Rajasekar Elango <[EMAIL PROTECTED]> wrote:
> >
> > > We found that mirrormaker stopped consuming and producing over the
> > > weekend (09/01). We only see "Client session timed out" messages in the
> > > mirrormaker log. I restarted it today (09/03) to resume processing. Here
> > > are the log lines in reverse order.
> > >
> > >
> > > 2013-09-03 14:20:40,918 [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506_watcher_executor] INFO  (kafka.utils.VerifiableProperties)  - Verifying properties
> > > 2013-09-03 14:20:40,877 [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506_watcher_executor] INFO  (kafka.consumer.ZookeeperConsumerConnector)  - [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506], begin rebalancing consumer mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506 try #1
> > > 2013-09-03 14:20:38,877 [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506_watcher_executor] INFO  (kafka.consumer.ZookeeperConsumerConnector)  - [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506], Committing all offsets after clearing the fetcher queues
> > > 2013-09-03 14:20:38,877 [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506_watcher_executor] INFO  (kafka.consumer.ZookeeperConsumerConnector)  - [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506], Cleared the data chunks in all the consumer message iterators
> > > 2013-09-03 14:20:38,877 [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506_watcher_executor] INFO  (kafka.consumer.ZookeeperConsumerConnector)  - [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506], Cleared all relevant queues for this fetcher
> > > 2013-09-03 14:20:38,877 [mirrormakerProd_ops-mmrs1-1-asg.ops.sfdc.net-1378218012575-6779d506_watcher_executor] INFO  (kafka.consumer.ConsumerFetcherManager)  - [ConsumerFetcherManager-1378218012760] All connections stopped
> > > 2013-09-03 14:20:38,877

 