Flume >> mail # user >> 4 times disk consumption?


Anat Rozenzon 2013-09-09, 12:00
Brock Noland 2013-09-10, 05:23
Re: 4 times disk consumption?
Thanks Brock!

I see a parameter called maxFileSize on the file channel:
maxFileSize 2146435071 Max size (in bytes) of a single log file
Is that what you mean?

However, I have 3 log files (and would probably have more if the channel had
not hit minimumRequiredSpace), and together they use more than the 2G default
of this parameter.
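
For reference, a minimal sketch of where these settings would go in the agent
configuration. The agent name (agent1), the channel name (fc1), the directory
paths and the chosen values are all assumptions for illustration; only the
property names and the maxFileSize default come from the file channel docs.
As far as I understand, maxFileSize caps each individual log-N file rather
than the total, so several files can still add up to more than that value.

```properties
# Hypothetical agent and channel names; adjust to your own configuration.
agent1.channels = fc1
agent1.channels.fc1.type = file

# Hypothetical paths; the data directory is where the log-N files are written.
agent1.channels.fc1.checkpointDir = /var/flume/filechannel/checkpoint
agent1.channels.fc1.dataDirs = /var/flume/filechannel/data

# Maximum number of events the channel holds (illustrative value).
agent1.channels.fc1.capacity = 1000000

# Max size in bytes of a single log file (default 2146435071).
# Illustrative value: roughly 256 MB per file.
agent1.channels.fc1.maxFileSize = 268435456

# The channel stops accepting puts/takes when free disk space drops below
# this many bytes (illustrative value: 500 MB).
agent1.channels.fc1.minimumRequiredSpace = 524288000
```

Smaller log files should let the channel roll over, and later remove fully
consumed files, in smaller steps, but I have not verified how much this
reduces the total footprint while the sink is down.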
On Tue, Sep 10, 2013 at 8:23 AM, Brock Noland <[EMAIL PROTECTED]> wrote:

> If you are concerned about disk space consumption you should lower the
> max log size on the file channel. The exact parameter is in the docs.
>
> On Tue, Sep 10, 2013 at 12:17 AM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:
> > After leaving flume to run in this state (sink is not sending the events),
> > the disk space has now grown to 3.4G!
> > I see the same files COMPLETED as yesterday so no new events were read into
> > the channel, yet the channel keeps growing!
> >
> > I see this file structure under the file channel work folder:
> >
> > [root@HTS4 old_logs]# du -sh flume/filechannel/data/*
> > 0       flume/filechannel/data/in_use.lock
> > 1.6G    flume/filechannel/data/log-1
> > 4.0K    flume/filechannel/data/log-1.meta
> > 1.6G    flume/filechannel/data/log-2
> > 4.0K    flume/filechannel/data/log-2.meta
> > 338M    flume/filechannel/data/log-3
> > 4.0K    flume/filechannel/data/log-3.meta
> >
> > Any way to avoid this behavior?
> >
> > ---------- Forwarded message ----------
> > From: Anat Rozenzon <[EMAIL PROTECTED]>
> > Date: Mon, Sep 9, 2013 at 3:00 PM
> > Subject: 4 times disk consumption?
> > To: [EMAIL PROTECTED]
> >
> >
> > Hi,
> >
> > I have a directory spooler connected to a file channel, currently with a
> > non-working sink.
> > Channel capacity is 200M (events?!), since the sink is not working, the
> > channel gets filled.
> >
> > However, I see that although the original files' total size is 150M, the full
> > file channel is using almost 4 times that disk space (i.e. 550M).
> >
> > Any idea why? Is this the expected ratio between original size and file
> > channel disk usage?
> >
> > Thanks
> > Anat
> >
>
>
>
> --
> Apache MRUnit - Unit testing MapReduce - http://mrunit.apache.org
>
Brock Noland 2013-09-10, 15:44