Re: Flume error in FileChannel
The source is an Avro Source, which gets events fed by a custom JVM application
using the Flume client SDK.

So, regarding the client SDK: if the batchSize property has been set to
1,000, but I pass, say, 10,000 events in a single
client.appendBatch(List<Event>) call, what happens?
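
For reference, one way to sidestep the question is to split the list on the client side, so that no single call carries more events than the configured batch size. Below is a minimal sketch in plain Java. BatchSplitter and split are hypothetical names, Integer stands in for Flume's Event, and nothing here calls the Flume API or says anything about how the SDK itself treats an oversized list.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplitter {

    /**
     * Splits items into consecutive sublists of at most maxBatch elements.
     * Hypothetical helper: not part of the Flume client SDK.
     */
    public static <T> List<List<T>> split(List<T> items, int maxBatch) {
        if (maxBatch <= 0) {
            throw new IllegalArgumentException("maxBatch must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += maxBatch) {
            int end = Math.min(start + maxBatch, items.size());
            // Copy the view so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Integer is a stand-in for an event type in this sketch.
        List<Integer> events = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) {
            events.add(i);
        }
        List<List<Integer>> chunks = split(events, 1_000);
        System.out.println(chunks.size() + " chunks, first has "
                + chunks.get(0).size() + " events");
    }
}
```

Each chunk could then be handed to the client in its own call, which keeps every call at or below 1,000 events and well under the channel's transaction capacity of 5,000.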
On Tue, Oct 15, 2013 at 3:54 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:

> What source are you using? It looks like the source is writing more than
> 5K events in one transaction.
>
>
> Thanks,
> Hari
>
> On Tuesday, October 15, 2013 at 12:24 PM, Bhaskar V. Karambelkar wrote:
>
> Recently we switched over from Memory Channel to File Channel, as Memory
> Channel has some GC issues.
> Occasionally in File Channel I see this exception:
>
> org.apache.flume.ChannelException: Put queue for FileBackedTransaction of
> capacity 5000 full, consider committing more frequently, increasing
> capacity or increasing thread count. [channel=fileChannelD1]
>
> Client batchSize is 1,000, and HDFS Sink batch size is also 1,000.
> The channel capacity is 1M (1,000,000), and the channel transaction
> capacity is 5,000.
>
> The underlying directories are not full, so the channel should have enough
> space, and the channel does not have any backlog.
>
> What I'm confused by are the 3 options the exception mentions.
>
> How do I commit more frequently? Or increase capacity? (The capacity of
> the channel is 1M, and that is not full.) Or increase thread count? (I see
> no thread count option in File Channel; or does this refer to the thread
> count of the HDFS sink which reads from this channel?)
>
> Lastly, would GC in Hadoop (mostly the NameNode) cause HDFS timeout issues
> in the HDFS Sink? We see HDFS timeout errors at more or less the same time
> across all our Flume nodes, so I suspect NameNode GC could be causing the
> timeouts.
>
>
> thanks
> Bhaskar
>
>
>