We are using Flume's HDFS sink to store log data in Amazon S3 and we are facing some throughput issues. Our Flume config has an avro source, a file channel and the hdfs sink. The file channel is on a provisioned IOPS EBS volume and we are running on an m1.large EC2 instance (Flume 1.4.0, Java 1.7.0).
Below you will find an example metric from our s3-file-channel. The main issue is that "EventTakeSuccessCount" can't keep up with "EventPutSuccessCount", so the sink drains events more slowly than the source adds them and our "ChannelSize" increases over time.
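To make the imbalance concrete, here is a minimal sketch of how the backlog growth can be derived from two snapshots of the channel counters. The counter names are the ones mentioned above; the function name `channel_rates`, the snapshot values and the 60-second interval are hypothetical and are not taken from our actual metrics.

```python
def channel_rates(before, after, interval_seconds):
    """Return put rate, take rate and backlog growth in events per second."""
    # int() is used because the JSON metrics output may report counters as strings.
    puts = int(after["EventPutSuccessCount"]) - int(before["EventPutSuccessCount"])
    takes = int(after["EventTakeSuccessCount"]) - int(before["EventTakeSuccessCount"])
    return {
        "put_rate": puts / interval_seconds,
        "take_rate": takes / interval_seconds,
        # A positive value means the channel is filling faster than it drains.
        "backlog_growth": (puts - takes) / interval_seconds,
    }


# Hypothetical snapshots taken 60 seconds apart (illustrative values only).
before = {"EventPutSuccessCount": "100000", "EventTakeSuccessCount": "90000"}
after = {"EventPutSuccessCount": "160000", "EventTakeSuccessCount": "120000"}

rates = channel_rates(before, after, 60)
print(rates)
```

A "backlog_growth" that stays positive across snapshots corresponds to the steadily increasing "ChannelSize" described above.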
We tried to use multiple hdfs-sinks but it didn't have any positive effect. Strangely, the problem is still there even when a memory channel is used. Another interesting fact is that we are also using an identical file-channel with the elasticsearch-sink and under the same load we don't have any throughput issues.
We would appreciate any suggestions that could help us improve the performance of the hdfs sink.
Thanks for the response. No, we don't get any errors in the Flume log. Everything seems to be OK; it is just not performing as expected. For reference, I'm also attaching some metrics from the elasticsearch file-channel.
It would be useful to see your HDFS sink config.
On Wed, Mar 5, 2014 at 11:47 AM, Nikolaos Tsipas <[EMAIL PROTECTED]> wrote: