Flume >> mail # user >> The size of data folder continue grow
Re: The size of data folder continue grow
On Tue, Nov 26, 2013 at 7:41 AM, GuoWei <[EMAIL PROTECTED]> wrote:
> Dear all,
>
> I use Flume and a custom base sink to put data into HBase. In Flume, I use the file channel.
>
> The file channel data is stored in the following folder.
>
> razor.channels.c_error.dataDirs = /var/lib/flume-ng/data/error
>
> But I see the channel data folder keeps growing, up to 2 GB.
>
> Does Flume remove the data from the channel data folder after it is flushed to HBase, or is it still stored in the data folder after the flush?

By default, the file channel keeps every log file that still holds data
referenced by the channel, plus 2 additional log files for safety. If
you have a small channel that is usually empty and you simply don't want
to keep that much data on disk, you can turn down the maximum file size
(the maxFileSize property) for the file channel.
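
For illustration, lowering the roll size on the channel from the
original message might look like the fragment below. The 256 MB value
and the explicit type line are assumptions picked for the example, not
recommendations; maxFileSize is given in bytes and defaults to roughly
2 GB (2146435071), which would fit the growth you are seeing.

```properties
# Sketch only: the channel name and dataDirs are taken from the original message.
razor.channels.c_error.type = file
razor.channels.c_error.dataDirs = /var/lib/flume-ng/data/error
# Assumed example value: roll data log files at about 256 MB instead of the default.
razor.channels.c_error.maxFileSize = 268435456
```

With a smaller roll size, old log files become eligible for removal
sooner once their events have been taken by the sink.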

> Or is the sink to HBase slower than the source?

You'd need to look at the metrics to decide this.

> And how can I speed up the sink to HBase?

I suggest you first determine whether that is actually a problem, but a
common issue with HBase is "hot spotting" caused by poor row key design.
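
To make "hot spotting" a bit more concrete: if row keys start with
something monotonically increasing, such as a timestamp, consecutive
writes tend to land on the same region. One common workaround is to
prefix the key with a small hash-derived salt so writes spread across
regions. The sketch below is only an illustration under that
assumption. The class name, method name, and bucket count are made up,
and it is not taken from your sink or from the HBase API.

```java
// Hypothetical helper: names and the bucket count are assumptions for illustration.
public class SaltedRowKey {

    // Prefix the natural key with a two-digit bucket derived from its hash,
    // so that sequential natural keys are spread over `buckets` key ranges.
    // `buckets` is expected to be between 1 and 100 to fit the two-digit prefix.
    public static String saltedKey(String naturalKey, int buckets) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(naturalKey.hashCode(), buckets);
        return String.format("%02d-%s", bucket, naturalKey);
    }

    public static void main(String[] args) {
        // Timestamp-leading keys like these would otherwise sort next to each other.
        String[] keys = {
            "20131126074100-host1",
            "20131126074101-host1",
            "20131126074102-host1"
        };
        for (String key : keys) {
            System.out.println(saltedKey(key, 16));
        }
    }
}
```

The trade-off is that range scans over the natural key then have to
fan out over every bucket, so this only pays off if write throughput
is really the bottleneck.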