The events from the file-channel are consumed by the sink and sent to another flume agent.
I have verified the event counts through jconsole on both the agent and the collector.
But the data files are still in the data directory: log-1 and log-2, at 1.6 and 1.6G respectively.
On Apr 11, 2013, at 3:08 PM, Mike Keane <[EMAIL PROTECTED]> wrote:
> Are you sure all your events were taken off the channel by the sink?
> Did you verify all the data you sent landed at the final destination? I
> have had my file channel back up like this when sinking to a slow source
> but eventually the file channel empties to a few MB provided I'm not
> adding data faster than the sink can remove it.
> I have only seen a similar problem once while evaluating Flume but was
> unable to reproduce it. I had 4 parallel flows. I killed the agents in
> the storage/filter tier (http://blogs.apache.org/flume/) and let logs
> back up in the collector tier. I watched the file channels on the
> collector tier grow to tens of GB each before restarting the
> storage/filter tier agents. 3 of the 4 file channels backing the 4
> parallel flows drained to a few MB each. The 4th however did not. Even
> after I stopped putting data on the flows and verified all data
> successfully landed in the final sink location the 4th channel was still
> 50+ GB. I stopped and restarted the agent and the agent iterated
> through all the data/checkpoint files. Ultimately it sent a couple more
> batches of events but the channel emptied.
> So yes, I have seen your problem; however, it was either explainable or
> not reproducible. Explainable in the case where data is added to the
> channel faster than the sink can remove it, and not reproducible the one
> time, when Flume fixed itself on a restart.
> Because of the one time I witnessed the channel not clearing, I will be
> monitoring the file channel size outside of Flume as a precaution when
> we move Flume to production.
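
Monitoring the channel size outside of Flume, as described above, could be
sketched roughly as follows. This is only an illustration, not something
from the thread: the directory paths and the 5 GiB threshold are
hypothetical placeholders, and the helper names dir_size_bytes and
oversized_dirs are made up for this example. Point it at whatever the
channel's dataDirs and checkpointDir are actually set to.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.path.getsize(full)
            except OSError:
                # A file may be removed between listing and stat; skip it.
                continue
    return total


def oversized_dirs(paths, threshold_bytes):
    """Return (path, size) pairs for directories larger than threshold_bytes."""
    sizes = [(p, dir_size_bytes(p)) for p in paths]
    return [(p, size) for p, size in sizes if size > threshold_bytes]


if __name__ == "__main__":
    # Hypothetical locations; substitute the channel's configured directories.
    watched = [
        "/var/flume/file-channel/data",
        "/var/flume/file-channel/checkpoint",
    ]
    limit = 5 * 1024 ** 3  # 5 GiB, an arbitrary example threshold
    for path, size in oversized_dirs(watched, limit):
        print(f"WARNING: {path} is {size / 1024 ** 3:.1f} GiB")
```

Run from cron or a similar scheduler, a script like this only reports the
size on disk. It cannot tell whether the channel still holds undelivered
events, so it complements the jconsole counters rather than replacing them.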
> On 04/11/2013 02:37 PM, Madhu Gmail wrote:
>> I have not heard from anyone, so I just want to make sure I have explained the issue correctly.
>> I think this is a common problem for everyone who uses Flume.
>> When the Flume sink consumes the log events from the file channel, what happens to the data that is committed to local disk under the data directory?
>> Will it grow indefinitely, like log-1, log-2, log-3, and so on?
>> Do I have to write a script to remove the data from the data directory?
>> Madhu Munagala
>> On Apr 11, 2013, at 11:52 AM, Madhu Gmail <[EMAIL PROTECTED]> wrote:
>>> How do I clean up the data in the file channel data folder? After the log events are processed by the sink, I still see that log-1 and log-2 are 1.6GB and 1.2GB.
>>> Once the log events are processed by the sink, shouldn't the data directory under file-channel be empty?
>>> Madhu Munagala