Re: Data in File-channel  data folder
Mike,

The events from the file-channel are consumed by the sink and sent to another flume agent.

I have verified the numbers through jconsole on the agent and the collector.
But the data is still in the data directory: log-1 and log-2 are 1.6 and 1.6G respectively.
Madhu  Munagala
(214)679-2872

On Apr 11, 2013, at 3:08 PM, Mike Keane <[EMAIL PROTECTED]> wrote:

> Are you sure all your events were taken off the channel by the sink?
> Did you verify all the data you sent landed at the final destination?  I
> have had my file channel back up like this when sinking to a slow source,
> but eventually the file channel empties to a few MB provided I'm not
> adding data faster than the sink can remove it.
>
> I have only seen a similar problem once, while evaluating Flume, but was
> unable to reproduce it.  I had 4 parallel flows.  I killed the agents in
> the storage/filter tier (http://blogs.apache.org/flume/) and let logs
> back up in the collector tier.  I watched the file channels on the
> collector tier grow to tens of GB each before restarting the
> storage/filter tier agents.  3 of the 4 file channels backing the 4
> parallel flows drained to a few MB each.  The 4th however did not.  Even
> after I stopped putting data on the flows and verified all data
> successfully landed in the final sink location the 4th channel was still
> 50+ GB.  I stopped and restarted the agent and the agent iterated
> through all the data/checkpoint files.  Ultimately it sent a couple more
> batches of events but the channel emptied.  
>
> So yes, I have seen your problem; however, it was either explainable or
> not reproducible: explainable in the case where data is added to the
> channel faster than the sink can remove it, and not reproducible the one
> other time, when Flume fixed itself on a restart.
>
> Because of the one time I witnessed the channel not clearing I will be
> monitoring the file channel size outside of flume as a precaution when
> we move flume to production.
>
> Regards,
>
> Mike
>
>
>
> On 04/11/2013 02:37 PM, Madhu Gmail wrote:
>> Hello,
>>
>> I have not heard from anyone, so I just want to make sure I have explained the issue correctly.
>>
>> I think this is a common problem for everyone who uses Flume.
>>
>> When the Flume sink consumes a log event from the file channel, what happens to the data that was committed to local disk under the data directory?
>>
>> Will it grow indefinitely, like log-1, log-2, log-3 and so on?
>>
>> Do I have to write a script to remove the data from the data directory?
>>
>>
>>
>> Madhu  Munagala
>> (214)679-2872
>>
>> On Apr 11, 2013, at 11:52 AM, Madhu Gmail <[EMAIL PROTECTED]> wrote:
>>
>>> Hello,
>>>
>>> How do I clean up the data in the file channel data folder?  After the log events are processed by the sink, I still see that log-1 and log-2 show 1.6GB and 1.2GB.
>>>
>>> Once the log events are processed by the sink, shouldn't the data directory under the file channel be empty?
>>>
>>>
>>> Madhu  Munagala
>>> (214)679-2872
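
As a rough sketch of the precaution Mike describes above (monitoring the file channel size outside of Flume), something like the following Python could be run periodically, for example from cron. The directory path and the threshold are hypothetical placeholders, not values from this thread; point them at whatever directory your file channel's dataDirs setting actually uses.

```python
import os


def dir_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A log file may be removed between listing and stat; skip it.
                pass
    return total


def over_threshold(path, limit_bytes):
    """True if the files under path add up to more than limit_bytes."""
    return dir_size_bytes(path) > limit_bytes


if __name__ == "__main__":
    # Hypothetical values: adjust to your own data directory and limit.
    data_dir = "/path/to/file-channel/data"
    limit = 5 * 1024 ** 3  # 5 GiB
    if over_threshold(data_dir, limit):
        print("WARNING: %s holds %d bytes" % (data_dir, dir_size_bytes(data_dir)))
```

This only reports how much disk the log-N files occupy; it does not tell you how many events are still waiting in the channel, so it complements rather than replaces the jconsole check.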