Re: 4 times disk consumption?
If you are concerned about disk space consumption you should lower the
max log size on the file channel. The exact parameter is in the docs.
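
For reference, a minimal sketch of what that could look like in the agent's
properties file. The agent name "agent" and channel name "fc" are
placeholders, the directory paths are hypothetical, and the values are
illustrative rather than recommendations. Please check the File Channel
section of the Flume User Guide for your version before relying on any of
the property names or defaults.

```properties
# Hypothetical agent ("agent") and channel ("fc") names; adjust to your setup.
agent.channels = fc
agent.channels.fc.type = file

# Placeholder paths, not taken from the original configuration.
agent.channels.fc.checkpointDir = /path/to/flume/filechannel/checkpoint
agent.channels.fc.dataDirs = /path/to/flume/filechannel/data

# Maximum size in bytes of a single log-N data file (illustrative: 256 MB).
agent.channels.fc.maxFileSize = 268435456

# Maximum number of events held in the channel (illustrative value).
agent.channels.fc.capacity = 1000000
```

As far as I understand, smaller log files only help once the sink starts
draining the channel: a data file can be removed after every event in it has
been taken, so smaller files become eligible for removal sooner.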

On Tue, Sep 10, 2013 at 12:17 AM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:
> After leaving Flume running in this state (sink is not sending the events),
> the disk space has now grown to 3.4G!
> I see the same files COMPLETED as yesterday so no new events were read into
> the channel, yet the channel keeps growing!
>
> I see this file structure under the file channel work folder:
>
> [root@HTS4 old_logs]# du -sh flume/filechannel/data/*
> 0       flume/filechannel/data/in_use.lock
> 1.6G    flume/filechannel/data/log-1
> 4.0K    flume/filechannel/data/log-1.meta
> 1.6G    flume/filechannel/data/log-2
> 4.0K    flume/filechannel/data/log-2.meta
> 338M    flume/filechannel/data/log-3
> 4.0K    flume/filechannel/data/log-3.meta
>
> Any way to avoid this behavior?
>
> ---------- Forwarded message ----------
> From: Anat Rozenzon <[EMAIL PROTECTED]>
> Date: Mon, Sep 9, 2013 at 3:00 PM
> Subject: 4 times disk consumption?
> To: [EMAIL PROTECTED]
>
>
> Hi,
>
> I have a directory spooler connected to a file channel, currently with a
> non-working sink.
> Channel capacity is 200M (events?!). Since the sink is not working, the
> channel fills up.
>
> However, I see that although the original files' total size is 150M, the full
> file channel is using almost 4 times that disk space (i.e. 550M).
>
> Any idea why? Is this the expected ratio between original size and file
> channel disk usage?
>
> Thanks
> Anat
>

--
Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org