Re: The size of data folder continue grow
On Tue, Nov 26, 2013 at 7:41 AM, GuoWei <[EMAIL PROTECTED]> wrote:
> Dear all,
>
> I use flume and custom base sink to put data to HBase. In flume, I use file channel.
>
> the file channel data put in the following folder.
>
> razor.channels.c_error.dataDirs = /var/lib/flume-ng/data/error
>
> But I see the channel data folder size continue grow until to 2gb.
>
> Does flume remove the data in channel data folder after flush to HBase? Or still store in data folder after flush to HBase ?

By default the file channel keeps every log file that still holds
events in the channel, plus 2 additional logs for safety purposes.  If
your channel is small and usually empty, and you simply don't want to
keep that much data on disk, you can turn down the max file size
(maxFileSize) for the file channel.
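
As a rough sketch of what that could look like, reusing the agent and
channel names from your config above (the 256 MB value is only an
illustration, not a recommendation; the default is about 2 GB, which
matches the size you are seeing):

```properties
# Sketch only: "razor" and "c_error" come from the quoted config.
razor.channels.c_error.type = file
razor.channels.c_error.dataDirs = /var/lib/flume-ng/data/error
# Roll to a new log file at ~256 MB (illustrative value) instead of the
# default of 2146435071 bytes.
razor.channels.c_error.maxFileSize = 268435456
```

Smaller log files let the channel delete fully consumed logs sooner, at
the cost of rolling files more often.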

> Or the speed sink to HBase is too slower than source ?

You'd need to look at the metrics to decide this.

> And how to speed up the sink to HBase ?

I suggest you first decide whether that is actually a problem, but a
common issue with HBase is "hot spotting" caused by bad row key design.
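
If it does turn out to be hot spotting, one common mitigation is to
"salt" the row key so that sequential keys spread across regions.  The
sketch below assumes your custom sink builds row keys from something
monotonically increasing such as a timestamp (I don't know your actual
key design).  The class name, the saltedKey helper, and the bucket
count of 16 are all hypothetical and not part of Flume or HBase.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SaltedRowKey {
    // Hypothetical bucket count; often chosen close to the number of regions.
    static final int BUCKETS = 16;

    // Prefix the original key with a bucket derived from a hash of the key.
    // The same key always maps to the same bucket, so point lookups still work.
    static String saltedKey(String originalKey) {
        int hash = Arrays.hashCode(originalKey.getBytes(StandardCharsets.UTF_8));
        int bucket = Math.floorMod(hash, BUCKETS);
        return String.format("%02d-%s", bucket, originalKey);
    }

    public static void main(String[] args) {
        // Consecutive timestamps (illustrative values) get a bucket prefix
        // instead of all sorting next to each other.
        for (long ts = 1385451660000L; ts < 1385451660005L; ts++) {
            System.out.println(saltedKey(Long.toString(ts)));
        }
    }
}
```

The trade-off is that a range scan over the original key order then
has to query every bucket, so this only makes sense if writes really
are the bottleneck.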