Bhaskar V. Karambelkar 2013-10-15, 19:24
Hari Shreedharan 2013-10-15, 19:54
The source is an Avro Source, which gets events fed to it by a custom JVM application
using the Flume client SDK.
So referring to the client SDK: if the batchSize property has been set to
1,000, but I pass, say, 10,000 events in the client.appendBatch(List<Event>) call,
what happens?
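If the SDK does not already cap batches for you, the safe pattern is to pre-chunk on the client side so that no single RPC (and hence no single channel transaction on the agent) exceeds the transaction capacity. A minimal sketch of that chunking logic, using only the standard library (the class and method names here are illustrative, not part of the Flume SDK):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchSplit {
    // Split a large event list into sub-batches of at most batchSize
    // elements, so each appendBatch() call stays within the channel's
    // transaction capacity. subList() returns views, so no copying occurs.
    static <T> List<List<T>> split(List<T> events, int batchSize) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < events.size(); i += batchSize) {
            batches.add(events.subList(i, Math.min(i + batchSize, events.size())));
        }
        return batches;
    }

    public static void main(String[] args) {
        // 10,000 events with batchSize 1,000 -> 10 sub-batches of 1,000 each.
        List<Integer> events = new ArrayList<>();
        for (int i = 0; i < 10_000; i++) events.add(i);
        List<List<Integer>> batches = split(events, 1_000);
        System.out.println(batches.size());        // 10
        System.out.println(batches.get(9).size()); // 1000
    }
}
```

Each sub-batch would then be passed to its own appendBatch() call, so a burst of 10,000 events never turns into a single oversized transaction on the agent.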
On Tue, Oct 15, 2013 at 3:54 PM, Hari Shreedharan <[EMAIL PROTECTED]> wrote:
> What source are you using? Looks like the source is writing > 5K events
> in one transaction
> On Tuesday, October 15, 2013 at 12:24 PM, Bhaskar V. Karambelkar wrote:
> Recently we switched over from Memory Channel to File Channel, as Memory
> Channel has some GC issues.
> Occasionally in File Channel I see this exception
> org.apache.flume.ChannelException: Put queue for FileBackedTransaction of
> capacity 5000 full, consider committing more frequently, increasing
> capacity or increasing thread count. [channel=fileChannelD1]
> Client batchSize is 1,000, and HDFS Sink batch size is also 1,000.
> The channel capacity is 1M (1,000,000), and Channel Tx Capacity is 5,000
> The underlying directories are not full, so the channel should have enough
> space, nor does the channel have any backlog.
> What I'm confused by are the 3 options the exception mentions.
> How do I commit more frequently? Or increase capacity? (The channel's
> capacity is 1M, and that is not full.) Or increase thread count? (I see no
> thread count option in the file channel; or is this referring to the thread
> count of the HDFS sink which reads from this channel?)
> Lastly, would GC in Hadoop (mostly the NameNode) cause HDFS timeout issues
> in the HDFS Sink? We see HDFS timeout errors at more or less the same time
> across all our Flume nodes, so I suspect NameNode GC could be causing the
> timeouts.
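For reference, the channel setup described above corresponds to a file-channel definition along these lines (the agent name and directory paths are hypothetical; the channel name and numbers are taken from the message, and `transactionCapacity` is the setting behind the "capacity 5000" in the exception):

```properties
# Hypothetical agent "a1"; channel name from the exception message.
a1.channels.fileChannelD1.type = file
a1.channels.fileChannelD1.capacity = 1000000
a1.channels.fileChannelD1.transactionCapacity = 5000
a1.channels.fileChannelD1.checkpointDir = /flume/checkpoint
a1.channels.fileChannelD1.dataDirs = /flume/data
```

The exception fires when a single transaction tries to put more than `transactionCapacity` events, independent of how full the overall `capacity` is, which is why the channel can look empty and still throw it.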