Flume >> mail # user >> Flume is replaying log for hours now


Re: Flume is replaying log for hours now
use-fast-replay would help, but you'd need 4-5GB of heap per channel. With
heaps that large you should be using dual checkpointing to avoid this.
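
For each file channel that could look something like the sketch below
(untested; the agent/channel names and paths are placeholders, and the
property names are as documented for the 1.4 File Channel, so double-check
them against your version):

# in the agent's .properties file, per file channel
agent.channels.ch1.type = file
agent.channels.ch1.checkpointDir = /flume/ch1/checkpoint
# dual checkpointing: keep a backup copy of the checkpoint in its own
# directory (must not be the checkpoint dir or a data dir)
agent.channels.ch1.useDualCheckpoints = true
agent.channels.ch1.backupCheckpointDir = /flume/ch1/checkpoint-backup
agent.channels.ch1.dataDirs = /flume/ch1/data
agent.channels.ch1.capacity = 100000000
# expert option: replay without the checkpoint queue; this is what needs
# the 4-5GB of heap per channel
agent.channels.ch1.use-fast-replay = true

# in flume-env.sh: with three file channels at 4-5GB each, size the heap
# to match (16g here is just an illustration)
export JAVA_OPTS="-Xms16g -Xmx16g"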

Here is the thread doing the replay:

"lifecycleSupervisor-1-0" prio=10 tid=0x00007f040472c800 nid=0x1332b
runnable [0x00007f03f84ce000]
   java.lang.Thread.State: RUNNABLE
        at org.apache.flume.channel.file.FlumeEventQueue.remove(FlumeEventQueue.java:194)
        - locked <0x00000007256d3dc8> (a
org.apache.flume.channel.file.FlumeEventQueue)
        at org.apache.flume.channel.file.ReplayHandler.processCommit(ReplayHandler.java:405)
        at org.apache.flume.channel.file.ReplayHandler.replayLog(ReplayHandler.java:328)
        at org.apache.flume.channel.file.Log.doReplay(Log.java:503)
        at org.apache.flume.channel.file.Log.replay(Log.java:430)
        at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:302)
        - locked <0x00000007256d2e38> (a
org.apache.flume.channel.file.FileChannel)
        at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
        - locked <0x00000007256d2e38> (a
org.apache.flume.channel.file.FileChannel)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:722)

On Thu, Aug 8, 2013 at 12:52 AM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm trying to restart Flume. My setup is:
>
> Avro source => File channel 1 => HDFS sink
>             => File channel 2 => Another HDFS sink
>             => File channel 3 => File sink
>
> But it seems to be doing replayLog for hours now. After seeing this
> yesterday, I even tried setting use-fast-replay=true, but it didn't help.
>
> Each file channel's capacity is 100000000; is this too high for Flume? I
> started with a lower number, but then it complained that the channel was
> getting full, so I raised it.
>
> My log is repeatedly writing such lines:
> 08 Aug 2013 01:36:22,856 INFO  [lifecycleSupervisor-1-1]
> (org.apache.flume.channel.file.ReplayHandler.replayLog:293)  - Read 3240000
> records
> 08 Aug 2013 01:36:41,324 INFO  [lifecycleSupervisor-1-0]
> (org.apache.flume.channel.file.ReplayHandler.replayLog:293)  - Read 3350000
> records
> 08 Aug 2013 01:38:35,794 INFO  [lifecycleSupervisor-1-1]
> (org.apache.flume.channel.file.ReplayHandler.replayLog:293)  - Read 3250000
> records
> 08 Aug 2013 01:40:48,759 INFO  [lifecycleSupervisor-1-1]
> (org.apache.flume.channel.file.ReplayHandler.replayLog:293)  - Read 3260000
> records
> 08 Aug 2013 01:41:01,684 INFO  [lifecycleSupervisor-1-0]
> (org.apache.flume.channel.file.ReplayHandler.replayLog:293)  - Read 4090000
> records
> 08 Aug 2013 01:41:36,691 INFO  [lifecycleSupervisor-1-0]
> (org.apache.flume.channel.file.ReplayHandler.replayLog:293)  - Read 4100000
> records
> 08 Aug 2013 01:42:27,528 INFO  [lifecycleSupervisor-1-0]
> (org.apache.flume.channel.file.ReplayHandler.replayLog:293)  - Read 4110000
> records
> 08 Aug 2013 01:42:57,725 INFO  [lifecycleSupervisor-1-1]
> (org.apache.flume.channel.file.ReplayHandler.replayLog:293)  - Read 3270000
> records
>
>
> I'm attaching the jstack output. I wasn't sure what the threads are doing,
> but in any case many of them seem to be waiting.
>
> Any idea what I can do to make the server start?
>
> Thanks
> Anat
>
>
--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org