Anat Rozenzon 2013-09-10, 05:17
Fwd: 4 times disk consumption?
I tried opening the sink now, but it seems that it can't take events from
the channel (as the channel has reached the minimumRequiredSpace limit); see
the error message below.
Is there any way I can continue?

10 Sep 2013 04:27:23,282 WARN [SinkRunner-PollingRunner-LoadBalancingSinkProcessor] (org.apache.flume.sink.LoadBalancingSinkProcessor.process:158)  - Sink failed to consume event. Attempting next sink if available.
java.lang.IllegalStateException: Channel closed [channel=fileChannel]. Due to java.io.IOException: Usable space exhaused, only 402567168 bytes remaining, required 524288000 bytes
        at org.apache.flume.channel.file.FileChannel.createTransaction(FileChannel.java:352)
        at org.apache.flume.channel.BasicChannelSemantics.getTransaction(BasicChannelSemantics.java:122)
        at org.apache.flume.sink.AbstractRpcSink.process(AbstractRpcSink.java:333)
        at org.apache.flume.sink.LoadBalancingSinkProcessor.process(LoadBalancingSinkProcessor.java:154)
        at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
        at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: Usable space exhaused, only 402567168 bytes remaining, required 524288000 bytes
        at org.apache.flume.channel.file.Log.writeCheckpoint(Log.java:985)
        at org.apache.flume.channel.file.Log.replay(Log.java:472)
        at org.apache.flume.channel.file.FileChannel.start(FileChannel.java:302)
        at org.apache.flume.lifecycle.LifecycleSupervisor$MonitorRunnable.run(LifecycleSupervisor.java:251)
        at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(Unknown Source)
        at java.util.concurrent.FutureTask.runAndReset(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(Unknown Source)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        ... 1 more
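
For reference, the two byte counts in the exception above can be compared with a short sketch. The numbers are copied from the log; the helper names `space_shortfall` and `free_bytes` and the choice of Python are illustrative assumptions, not part of Flume.

```python
import shutil

MIB = 1024 * 1024


def space_shortfall(usable_bytes: int, required_bytes: int) -> int:
    """Return how many more bytes must be free to meet the requirement (0 if met)."""
    return max(0, required_bytes - usable_bytes)


def free_bytes(path: str) -> int:
    """Bytes currently available on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


# Values copied from the IOException in the log above.
usable = 402567168
required = 524288000

shortfall = space_shortfall(usable, required)
print(f"required:  {required / MIB:.1f} MiB")
print(f"usable:    {usable / MIB:.1f} MiB")
print(f"shortfall: {shortfall / MIB:.1f} MiB")
```

By these numbers the volume is roughly 116 MiB short of the required 500 MiB, so that much space would have to be freed (or the threshold lowered) before the check could pass. `free_bytes` could be pointed at the channel's data directory to watch the same figure on a live host.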
---------- Forwarded message ----------
From: Anat Rozenzon <[EMAIL PROTECTED]>
Date: Tue, Sep 10, 2013 at 8:43 AM
Subject: Re: 4 times disk consumption?
To: [EMAIL PROTECTED]
Thanks Brock!

I see a parameter called maxFileSize on the file channel:
    maxFileSize    2146435071    Max size (in bytes) of a single log file
Is that what you mean?

However, I have 3 log files (and would probably have more if the channel
hadn't reached the minimumRequiredSpace limit), and together they use more
than the default 2G of this parameter.
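
As a rough check of that observation, the sketch below compares the log file sizes from the `du -sh` listing quoted further down the thread against the documented per-file default. The sizes are the rounded values `du` printed, so the byte counts are approximate; the helper `parse_du_size` is a hypothetical name introduced only for this illustration.

```python
# du -h style suffixes are powers of 1024.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_du_size(text: str) -> int:
    """Convert a `du -h` size such as '1.6G' or '0' to approximate bytes."""
    if text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)


# Documented default for maxFileSize, as quoted in this message.
MAX_FILE_SIZE = 2146435071

# Rounded sizes taken from the du listing quoted later in the thread.
log_files = {"log-1": "1.6G", "log-2": "1.6G", "log-3": "338M"}
sizes = {name: parse_du_size(size) for name, size in log_files.items()}
total = sum(sizes.values())

for name, size in sizes.items():
    print(name, size, "within per-file cap:", size <= MAX_FILE_SIZE)
print("total:", total, "exceeds a single-file cap:", total > MAX_FILE_SIZE)
```

Each file on its own stays under the per-file limit while the sum does not, which is consistent with maxFileSize bounding a single log file rather than the channel's total disk usage.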
On Tue, Sep 10, 2013 at 8:23 AM, Brock Noland <[EMAIL PROTECTED]> wrote:

> If you are concerned about disk space consumption you should lower the
> max log size on the file channel. The exact parameter is in the docs.
>
> On Tue, Sep 10, 2013 at 12:17 AM, Anat Rozenzon <[EMAIL PROTECTED]> wrote:
> > After leaving flume to run in this state (sink is not sending the events),
> > the disk space has now grown to 3.4G!
> > I see the same files COMPLETED as yesterday so no new events were read into
> > the channel, yet the channel keeps growing!
> >
> > I see this file structure under the file channel work folder:
> >
> > [root@HTS4 old_logs]# du -sh flume/filechannel/data/*
> > 0       flume/filechannel/data/in_use.lock
> > 1.6G    flume/filechannel/data/log-1
> > 4.0K    flume/filechannel/data/log-1.meta
> > 1.6G    flume/filechannel/data/log-2
> > 4.0K    flume/filechannel/data/log-2.meta
> > 338M    flume/filechannel/data/log-3
> > 4.0K    flume/filechannel/data/log-3.meta
> >
> > Any way to avoid this behavior?
> >
> > ---------- Forwarded message ----------
> > From: Anat Rozenzon <[EMAIL PROTECTED]>
> > Date: Mon, Sep 9, 2013 at 3:00 PM
> > Subject: 4 times disk consumption?
> > To: [EMAIL PROTECTED]
> >
> >
> > Hi,
> >
> > I have a directory spooler connected to a file channel, currently with a
> > non-working sink.
> > Channel capacity is 200M (events?!), since the sink is not working, the