Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Flume, mail # dev - Slow behaviour on replay from FileChannel


Copy link to this message
-
Re: Slow behaviour on replay from FileChannel
Edward Sargisson 2013-06-23, 13:27
Hi Mike,
No - not at present. :-(

There's a possibility that a repro exists on a certain non-critical server
which I will need to fix sometime next week. If it happens to have the same
failure I'll see if I can figure something out.

Cheers,
Edward

"Edward,
Someone told me they saw similar behavior but that it seemed intermittent /
not consistent. I haven't seen this, typically the FC is very fast with
replay. Any update on this?

Thanks,
Mike

On Mon, Jun 17, 2013 at 3:53 PM, Edward Sargisson <[EMAIL PROTECTED]> wrote:

> Hi all,
> This may be a user question so feel free to punt me to that list. However,
> I've just seen behaviour which seems mighty slow and I don't understand
> why.
>
> I restarted one of our Flume agents and it took about 23 minutes before it
> was ready to accept new events. The logs seem to indicate that it took the
> majority of that time to workthrough the data file that only had 6885
> events in it. This seems mighty slow to me.
>
> Does anybody have an explanation for this? Is there something I should do
> in the future to bring it back up faster? I looked at the code and there's
> nothing obviously slow about it.
>
> Many thanks,
> Edward
>
> Log snippet (filtered to be only this thread and large number of Pending
> take messages removed):
> 2013-06-17 21:53:20,154  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.FileChannel Starting FileChannel troubleshootingFileChannel {
> dataDirs: [/var/local/flume/troubleshooting-file-channel/data] }...
> 2013-06-17 21:53:20,155  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.Log Encryption is not enabled
> 2013-06-17 21:53:20,155  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.Log Replay started
> 2013-06-17 21:53:20,165  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.Log Found NextFileID 20, from
> [/var/local/flume/troubleshooting-file-channel/data/log-20,
> /var/local/flume/troubleshooting-file-channel/data/log-19]
> 2013-06-17 21:53:20,172  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.EventQueueBackingStoreFileV3 Starting up with
> /var/local/flume/troubleshooting-file-channel/checkpoint/checkpoint and
> /var/local/flume/troubleshooting-file-channel/checkpoint/checkpoint.meta
> 2013-06-17 21:53:20,172  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.EventQueueBackingStoreFileV3 Reading checkpoint metadata from
> /var/local/flume/troubleshooting-file-channel/checkpoint/checkpoint.meta
> 2013-06-17 21:53:20,213  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.Log Last Checkpoint Mon Jun 17 21:04:26 UTC 2013, queue depth 0
> 2013-06-17 21:53:20,222  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.Log Replaying logs with v2 replay logic
> 2013-06-17 21:53:20,225  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.ReplayHandler Starting replay of
> [/var/local/flume/troubleshooting-file-channel/data/log-19,
> /var/local/flume/troubleshooting-file-channel/data/log-20]
> 2013-06-17 21:53:20,226  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.ReplayHandler Replaying
> /var/local/flume/troubleshooting-file-channel/data/log-19
> 2013-06-17 21:53:20,275  WARN [lifecycleSupervisor-1-1]
> o.a.f.c.f.LogFile Checkpoint for
> file(/var/local/flume/troubleshooting-file-channel/data/log-19) is:
> 1371488755062, which is beyond the requested checkpoint time: 0 and
> position 284327361
> 2013-06-17 21:53:20,287  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.ReplayHandler Replaying
> /var/local/flume/troubleshooting-file-channel/data/log-20
> 2013-06-17 21:53:20,288  WARN [lifecycleSupervisor-1-1]
> o.a.f.c.f.LogFile Checkpoint for
> file(/var/local/flume/troubleshooting-file-channel/data/log-20) is:
> 1371488770226, which is beyond the requested checkpoint time: 0 and
> position 7078049
> 2013-06-17 22:16:16,161  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.LogFile Encountered EOF at 284348767 in
> /var/local/flume/troubleshooting-file-channel/data/log-19
> 2013-06-17 22:16:37,802  INFO [lifecycleSupervisor-1-1]
> o.a.f.c.f.LogFile Encountered EOF at 7266532 in
> /var/local/flume/troubleshooting-file-channel/data/log-20
> 2013-06-17 22:16:37,805  INFO [lifecycleSupervisor-1-1]
data