Flume >> mail # user >> Data in File-channel  data folder


Madhu Gmail 2013-04-11, 16:52
Madhu Gmail 2013-04-11, 19:13
Hari Shreedharan 2013-04-11, 19:57
Madhu Gmail 2013-04-11, 20:11
Mike Keane 2013-04-11, 20:08

Re: Data in File-channel  data folder
Mike,

The events from the file channel are consumed by the sink and sent to another Flume agent.

I have verified the numbers through jconsole on the agent and the collector.
But the data is still in the data directory: log-1 and log-2 are 1.6 and 1.6G respectively.
Madhu  Munagala
(214)679-2872

On Apr 11, 2013, at 3:08 PM, Mike Keane <[EMAIL PROTECTED]> wrote:

> Are you sure all your events were taken off the channel by the sink?
> Did you verify all the data you sent landed at the final destination?  I
> have had my file channel back up like this when sinking to a slow source,
> but eventually the file channel empties to a few MB provided I'm not
> adding data faster than the sink can remove it.
>
> I have only seen a similar problem once, while evaluating Flume, but was
> unable to reproduce it.  I had 4 parallel flows.  I killed the agents in
> the storage/filter tier (http://blogs.apache.org/flume/) and let logs
> back up in the collector tier.  I watched the file channels on the
> collector tier grow to tens of GB each before restarting the
> storage/filter tier agents.  3 of the 4 file channels backing the 4
> parallel flows drained to a few MB each.  The 4th, however, did not.  Even
> after I stopped putting data on the flows and verified all data
> had successfully landed in the final sink location, the 4th channel was still
> 50+ GB.  I stopped and restarted the agent, and the agent iterated
> through all the data/checkpoint files.  Ultimately it sent a couple more
> batches of events, but the channel emptied.
>
> So yes, I have seen your problem, but it was either explainable or
> not reproducible: explainable in the case where data is added to the
> channel faster than the sink can remove it, and not reproducible the one
> time when Flume fixed itself on a restart.
>
> Because of the one time I witnessed the channel not clearing, I will be
> monitoring the file channel size outside of Flume as a precaution when
> we move Flume to production.
>
> Regards,
>
> Mike
>
>
>
> On 04/11/2013 02:37 PM, Madhu Gmail wrote:
>> Hello,
>>
>> I have not heard from anyone, so I just want to make sure I have explained the issue correctly.
>>
>> I think this is a common problem for everyone who uses Flume.
>>
>> When the Flume sink consumes a log event from the file channel, what happens to the data that was committed to local disk under the data directory?
>>
>> Will it grow indefinitely, like log-1, log-2, log-3, and so on?
>>
>> Do I have to write a script to remove the data from the data directory?
>>
>>
>>
>> Madhu  Munagala
>> (214)679-2872
>>
>> On Apr 11, 2013, at 11:52 AM, Madhu Gmail <[EMAIL PROTECTED]> wrote:
>>
>>> Hello,
>>>
>>> How do I clean up the data in the file channel data folder?  After the log events are processed by the sink, I still see that log-1 and log-2 show 1.6GB and 1.2GB.
>>>
>>> Once the log events are processed by the sink, shouldn't the data directory under the file channel be empty?
>>>
>>>
>>> Madhu  Munagala
>>> (214)679-2872
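
Note: Mike mentions monitoring the file channel size from outside Flume. One way to do that could look like the sketch below, which simply sums the sizes of the files under a directory and compares the total to a limit. This is only an illustration, not something posted in the thread: the function names and the threshold are hypothetical, and the demo uses a temporary directory with small stand-in files named log-1 and log-2 rather than a real file channel data directory.

```python
import os
import tempfile


def directory_size_bytes(path):
    """Return the total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full_path = os.path.join(root, name)
            try:
                total += os.path.getsize(full_path)
            except OSError:
                # A file can disappear between listing and stat,
                # so skip anything that is no longer there.
                pass
    return total


def over_threshold(path, limit_bytes):
    """Return True if the files under path use more than limit_bytes."""
    return directory_size_bytes(path) > limit_bytes


if __name__ == "__main__":
    # Demo on a throwaway directory; in practice `data_dir` would be the
    # directory configured for the channel (an assumption, not shown in the thread).
    with tempfile.TemporaryDirectory() as data_dir:
        for name, size in (("log-1", 1600), ("log-2", 1200)):
            with open(os.path.join(data_dir, name), "wb") as handle:
                handle.write(b"\0" * size)
        print("bytes used:", directory_size_bytes(data_dir))
        print("over limit:", over_threshold(data_dir, 2000))
```

A check like this could be run periodically (for example from cron) and wired to whatever alerting is already in place. It only reports size on disk; it says nothing about how many events are still queued in the channel.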