Flume, mail # user - Flume Data Directory Cleanup


Re: Flume Data Directory Cleanup
Jeremy Karlson 2013-07-18, 19:24
I did a hard delete.  (I was out of disk space.)  I ended up just deleting
the whole channel directory and starting fresh.

I am running a very recent version, so I don't think I'd be affected by the
file removal bug...  And obviously my files were still in use, for reasons
I don't understand yet.

-- Jeremy
On Thu, Jul 18, 2013 at 11:09 AM, Hari Shreedharan <
[EMAIL PROTECTED]> wrote:

> Flume's deletion strategy is quite conservative. We do wait for 2
> checkpoints after all data was removed from a file before the files are
> deleted. In this case, it does look like the data was actually still
> referenced. We had a bug sometime back that caused files to not be deleted
> - but that was fixed quite a while back.
>
> Thanks,
> Hari
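Hari's description of the deletion strategy (log files removed only after two checkpoints confirm no referenced events) involves the file channel's checkpoint settings. A minimal illustrative fragment of that configuration, assuming hypothetical agent and channel names (`agent1`, `ch1`) and made-up paths, might look like:

```properties
# Hypothetical agent/channel names and paths; the property keys are
# standard Flume file-channel settings.
agent1.channels.ch1.type = file
agent1.channels.ch1.checkpointDir = /var/lib/flume/file-channel/checkpoint
agent1.channels.ch1.dataDirs = /var/lib/flume/file-channel/data
# Checkpoint every 30 s (the default). Log files in dataDirs become
# eligible for deletion only after checkpoints show they no longer
# hold referenced events.
agent1.channels.ch1.checkpointInterval = 30000
```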
>
> On Thursday, July 18, 2013 at 10:56 AM, Camp, Roy wrote:
>
>  We have noticed a few times that cleanup did not happen properly, but a
> restart generally forced a cleanup.
>
> I would recommend putting the data files back unless you did a hard
> delete.  Alternatively, make sure you remove (backup first) the checkpoint
> files if you delete the data files.  That should put Flume back to a fresh
> state.
>
> Roy
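Roy's reset procedure (stop the agent, back up and remove both the checkpoint and data files, then restart) can be sketched roughly as below. This is a hedged illustration, not an official Flume tool: the directory layout under a temp dir is hypothetical, and on a real agent you would use the configured checkpointDir and dataDirs.

```shell
# Hypothetical channel layout under a temp dir for illustration; on a
# real agent these would be the configured checkpointDir and dataDirs.
CHANNEL_DIR=$(mktemp -d)/file-channel
mkdir -p "$CHANNEL_DIR/checkpoint" "$CHANNEL_DIR/data"
touch "$CHANNEL_DIR/data/log-1" "$CHANNEL_DIR/data/log-1.meta"

# 1. Stop the Flume agent first; never touch these files while it runs.
# 2. Back up rather than hard-delete, so the files can be restored if
#    the channel turns out to still need them.
BACKUP_DIR=$(mktemp -d)
mv "$CHANNEL_DIR/checkpoint" "$CHANNEL_DIR/data" "$BACKUP_DIR"/

# 3. Recreate empty directories; on restart the file channel rebuilds
#    its state (checkpoint plus new log files) from scratch.
mkdir -p "$CHANNEL_DIR/checkpoint" "$CHANNEL_DIR/data"
```

Removing the data files while keeping an old checkpoint (or vice versa) leaves the two inconsistent, which is why Roy suggests removing both together.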
>
> *From:* Jeremy Karlson [mailto:[EMAIL PROTECTED]]
> *Sent:* Thursday, July 18, 2013 10:42 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* Re: Flume Data Directory Cleanup
>
>
> Thank you for your suggestion.  I took a careful look at that, and I'm not
> sure it describes my situation.  That thread refers to the sink, while my
> problem is with the channel.  I'm looking at a dramatic accumulation of
> log / meta files within the channel data directory.
>
> Additionally, I did try doing a manual cleanup of the channel directory,
> deleting the oldest log / meta files.  (This was my experiment.)  Flume
> really did not like that.  If manual cleanup is required in the channel as
> well, the cutoff point at which the files go from being in use to unused
> is not clear to me.
>
> -- Jeremy
>
> On Thu, Jul 18, 2013 at 10:13 AM, Lenin Raj <[EMAIL PROTECTED]> wrote:
>
> Hi Jeremy,
>
> Regarding cleanup, this was discussed here once already:
>
> http://mail-archives.apache.org/mod_mbox/flume-user/201306.mbox/%[EMAIL PROTECTED]%3E
>
> You have to do it manually.
>
> Thanks,
> Lenin
>
>
> On Thu, Jul 18, 2013 at 10:36 PM, Jeremy Karlson <[EMAIL PROTECTED]>
> wrote:
>
> To follow up:
>
> My Flume agent ran out of disk space last night and appeared to stop
> processing.  I shut it down and, as an experiment (it's a test machine,
> why not?), deleted the oldest 10 data files to see if Flume actually
> needed them when it restarted.
>
> Flume was not happy with my choices.
>
> It spit out a lot of this:
>
>
> 2013-07-18 00:00:00,013 ERROR [pool-40-thread-1] o.a.f.s.AvroSource
> Avro source mySource: Unable to process event batch. Exception follows.
> java.lang.IllegalStateException: Channel closed [channel=myFileChannel].
> Due to java.lang.NullPointerException: null
>         at org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:353)
>         at org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
>         ...
> Caused by: java.lang.NullPointerException
>         at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:895)
>         at org.apache.flume.channel.file.Log.replay(Log.java:406)
>         at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:303)
>         ...
>
>
> So it seems like these files were actually in use, and not just leftover
> cruft.  A worthwhile thing to know, but I'd like to understand why.  My
> events are probably at most 1k of text, so it seems kind of odd to me that