Flume >> mail # user >> File Channel Capacity issue

Camp, Roy 2012-11-23, 19:58
Hari Shreedharan 2012-11-23, 21:14
Camp, Roy 2012-11-25, 23:08
Brock Noland 2012-11-25, 23:13
RE: File Channel Capacity issue

I'm a bit confused by this.  Are you saying that after the FileChannel is full, the events would be held in heap?  My understanding was that when the FileChannel is full and you receive the error I was seeing, the transaction would be considered failed, and the upstream Flume instance would be responsible for keeping the event in its channel.  Since my upstream Flume instances also use the FileChannel, those events would remain stored on disk.  If I'm not using a memory channel, why would I need that much heap?

From: Brock Noland [mailto:[EMAIL PROTECTED]]
Sent: Sunday, November 25, 2012 3:14 PM
Subject: Re: File Channel Capacity issue


If the channel were full, each event would require 32 bytes of in-heap memory. So an agent with a capacity of 100 million events would require a heap of at least 3GB, but I would set 3.5GB to be safe.
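As a back-of-the-envelope check of the sizing above (the 32-bytes-per-event figure is taken from this thread, not from inspecting the Flume source):

```shell
# Heap needed for FileChannel bookkeeping at full capacity,
# assuming ~32 bytes of in-heap state per event as stated above.
capacity=100000000                 # configured channel capacity (events)
bytes=$((capacity * 32))           # in-heap bytes when the channel is full
gb=$((bytes / 1000000000))         # rough decimal GB
echo "${bytes} bytes (~${gb} GB)"  # prints: 3200000000 bytes (~3 GB)
```

That ~3.2GB is why the recommendation is a 3.5GB heap rather than exactly 3GB.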

On Sun, Nov 25, 2012 at 5:08 PM, Camp, Roy <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
Ah okay, thanks!  I think I might just start a second instance of flume with an updated config and then flip the application layer over.

Is there a maximum value that the FileChannel capacity will accept?  Would setting this value really high result in any performance impact?

Each of my collectors processes about 10MM events per day, so in the event the sink fails during a weekend, I don't want it to fill up and lose events.  I'm thinking of setting the capacity to 100MM.
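A quick sanity check on that sizing (using the ~10MM events/day figure from the paragraph above):

```shell
# How long a full-capacity FileChannel can absorb traffic if the sink
# is down, at the stated ingest rate.
daily=10000000                           # ~10MM events/day per collector
capacity=100000000                       # proposed channel capacity
echo "buffer: $((capacity / daily)) days"  # prints: buffer: 10 days
```

So a 100MM capacity covers roughly ten days of outage at that rate, comfortably more than a weekend.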


From: Hari Shreedharan [mailto:[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>]
Sent: Friday, November 23, 2012 1:14 PM
Subject: Re: File Channel Capacity issue


The FileChannel actually uses a fixed size checkpoint file -- so it is not possible to set it to unlimited size (the checkpoint file is mmap-ed to a fixed size buffer). To change the capacity of the channel, the easiest way off the top of my head is:

* Shutdown the agent.
* Delete all files in the file channel's checkpoint directory (not the data directories). To be safe, you may want to move them out rather than delete them.
* Change your configuration to increase the capacity of the channel.
* Restart the agent. This will cause a full replay, so the agent might take some time to start up if there are a lot of events in the channel. To avoid this, shut down the source before shutting down the agent, so the sink can drain the channel completely; then wait about 1-2 minutes after the channel is empty so that the data files get deleted (this happens only immediately after a checkpoint -- you can verify it by making sure each data dir has only 2 files). Since all events have been sent out, the channel will be nearly empty during restart, with very little to replay.
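The steps above could be sketched roughly as follows. The paths and the service commands are assumptions (adjust to your install); only the checkpoint-file move is actually performed here:

```shell
#!/bin/sh
# Sketch of the capacity-change procedure described above.
# CHECKPOINT_DIR / BACKUP_DIR defaults are hypothetical paths.
set -eu

CHECKPOINT_DIR=${CHECKPOINT_DIR:-/var/lib/flume/file-channel/checkpoint}
BACKUP_DIR=${BACKUP_DIR:-/var/lib/flume/file-channel/checkpoint.bak}

# 1. Stop the agent first (command depends on how you run Flume), e.g.:
# service flume-agent stop

# 2. Move the checkpoint files aside instead of deleting them;
#    the data directories are left untouched.
mkdir -p "$BACKUP_DIR"
mv "$CHECKPOINT_DIR"/* "$BACKUP_DIR"/ 2>/dev/null || true

# 3. Raise the channel's capacity in the agent config, then restart:
# service flume-agent start    # triggers a full replay of the data files
```

Draining the channel before the shutdown (as described above) keeps the replay on restart short.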
Hope this helps.


Hari Shreedharan
On Friday, November 23, 2012 at 11:58 AM, Camp, Roy wrote:

I am having an issue with a slow sink (not Flume related), but it is causing my file channel to overflow.  When I try to increase the capacity I get the following error, but I can't seem to find the setting it refers to in any of the documentation.  Additionally, is there a way to set the file channel to unlimited?  It doesn't seem to like 0.

java.lang.IllegalStateException: Channel closed [channel=collectorfile]. Due to java.lang.IllegalStateException: Configured capacity is 100000000 but the  checkpoint file capacity is 1000000. See FileChannel documentation on how to change a channels capacity.
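For reference, the capacity the error refers to is set on the channel itself. A hypothetical config fragment (the agent name and directory paths are placeholders; the channel name matches the one in the error above):

```properties
# FileChannel configuration sketch -- the new capacity only takes
# effect after the old checkpoint files are removed, per Hari's steps.
agent1.channels = collectorfile
agent1.channels.collectorfile.type = file
agent1.channels.collectorfile.checkpointDir = /flume/checkpoint
agent1.channels.collectorfile.dataDirs = /flume/data
agent1.channels.collectorfile.capacity = 100000000
```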


Apache MRUnit - Unit testing MapReduce - http://incubator.apache.org/mrunit/