Re: HDFS Sink Question
David Sinclair 2013-10-04, 14:42
Thanks Devin. I have looked at the source, and I can say for certain that
the connection is never re-established, because there is no code that
detects that type of error.
What I was looking for from the devs was confirmation of my findings and
any workarounds besides writing my own HDFS Sink.
Not recovering from this gracefully is a pain and may prevent us from using
Flume.
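For anyone hitting the same wall, this is roughly the recovery logic I'd
expect a custom sink to need: on a write failure, discard the dead stream so
the next attempt recreates the file. This is an untested sketch of mine
against the Hadoop FileSystem API; the class and method names are made up,
not Flume internals.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Hypothetical writer that reopens its HDFS file after a failure,
    // instead of holding on to a dead connection.
    public class ReopeningHdfsWriter {
        private final FileSystem fs;
        private final Path path;
        private FSDataOutputStream out; // null whenever the last write failed

        public ReopeningHdfsWriter(Configuration conf, Path path) throws IOException {
            this.fs = FileSystem.get(conf);
            this.path = path;
        }

        public void append(byte[] record) throws IOException {
            try {
                if (out == null) {
                    // Recreate (or re-append to) the file after a failure.
                    out = fs.exists(path) ? fs.append(path) : fs.create(path);
                }
                out.write(record);
                out.hflush(); // push the bytes out to the datanodes
            } catch (IOException e) {
                closeQuietly(); // drop the broken stream so the caller can
                throw e;        // roll back the transaction and retry later
            }
        }

        private void closeQuietly() {
            if (out == null) return;
            try { out.close(); } catch (IOException ignored) { }
            out = null;
        }
    }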
On Fri, Oct 4, 2013 at 9:21 AM, DSuiter RDX <[EMAIL PROTECTED]> wrote:
> In experimenting with the file_roll sink for local logging, I noticed that
> the file it wrote to was created when the agent starts. If you start the
> agent, then remove the file and attempt to write, no new file is
> created. Perhaps the HDFS sink is similar: when the sink starts, the
> destination is established, and if that file chain is broken, Flume
> cannot gracefully detect and correct it. It may have something to do with
> how the sink is looking for the target. I'm not a developer for Flume, but
> that is the behavior I observed with file_roll. I am working through kinks
> in the HDFS sink with remote TCP logging from rsyslog right now... maybe I
> will have some more insight for you in a few days.
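> As a quick illustration of why there is no error to react to: on a POSIX
> filesystem, an open stream keeps "writing" happily after the file is
> deleted, because the descriptor still points at the unlinked inode. A few
> lines of plain Java show it (a sketch; Linux/macOS only):
>
>     import java.io.FileWriter;
>     import java.io.IOException;
>     import java.nio.file.Files;
>     import java.nio.file.Paths;
>
>     public class DeletedFileWrite {
>         public static void main(String[] args) throws IOException {
>             FileWriter out = new FileWriter("/tmp/file-roll-demo.log");
>             out.write("before delete\n");
>             out.flush();
>             // Remove the file out from under the open writer.
>             Files.delete(Paths.get("/tmp/file-roll-demo.log"));
>             out.write("after delete\n"); // no exception is thrown; the bytes
>             out.flush();                 // go to the unlinked inode, then are lost
>             out.close();
>         }
>     }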
> *Devin Suiter*
> Jr. Data Solutions Software Engineer
> 100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
> Google Voice: 412-256-8556 | www.rdx.com
> On Fri, Oct 4, 2013 at 9:08 AM, David Sinclair <
> [EMAIL PROTECTED]> wrote:
>> This is what I am seeing for the scenarios I asked about, but I wanted
>> confirmation from the devs on the expected behavior.
>> - HDFS isn't available before ever trying to create/write to a file -
>> *continually tries to create the file and finally succeeds when the
>> cluster is available* (see the sketch below)
>> - HDFS becomes unavailable after already creating a file and starting
>> to write to it - *the writer loses the connection, and even after the
>> cluster is available again it never re-establishes the connection. Data
>> loss occurs since it never recovers*
>> - HDFS is unavailable when trying to close a file - *suffers from the
>> same problems as above*
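>> The first scenario behaves as if the sink sits in a loop like the one
>> below until the create finally succeeds. This is my illustration of the
>> observed behavior, not the actual Flume code:
>>
>>     import java.io.IOException;
>>     import org.apache.hadoop.fs.FSDataOutputStream;
>>     import org.apache.hadoop.fs.FileSystem;
>>     import org.apache.hadoop.fs.Path;
>>
>>     public class CreateRetry {
>>         // Keep trying to create the file until the cluster comes back.
>>         static FSDataOutputStream createWithRetry(FileSystem fs, Path p)
>>                 throws InterruptedException {
>>             while (true) {
>>                 try {
>>                     return fs.create(p); // throws while HDFS is unreachable
>>                 } catch (IOException e) {
>>                     Thread.sleep(5000L); // back off, then try again
>>                 }
>>             }
>>         }
>>     }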
>> On Tue, Oct 1, 2013 at 11:04 AM, David Sinclair <
>> [EMAIL PROTECTED]> wrote:
>>> Hi all,
>>> I have created an AMQP Source that is being used to feed an HDFS Sink.
>>> Everything is working as expected, but I wanted to try out some error
>>> scenarios.
>>> After creating a file in HDFS and starting to write to it, I shut down
>>> HDFS. I saw the errors in the log as I would expect, and after the
>>> configured roll time it tried to close the file. Since HDFS wasn't
>>> running, it wasn't able to do so. I restarted HDFS in the hope that it
>>> would retry the close, but it did not.
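>>> What I was hoping for on restart was a close that retries until HDFS is
>>> reachable again, something along these lines (a rough sketch of mine,
>>> not anything in the Flume code):
>>>
>>>     import java.io.IOException;
>>>     import org.apache.hadoop.fs.FSDataOutputStream;
>>>
>>>     public class CloseRetry {
>>>         // Retry the close until it succeeds or we run out of attempts.
>>>         static void closeWithRetry(FSDataOutputStream out, int maxTries)
>>>                 throws IOException, InterruptedException {
>>>             for (int attempt = 1; ; attempt++) {
>>>                 try {
>>>                     out.close(); // finalizes the last block on the NameNode
>>>                     return;
>>>                 } catch (IOException e) {
>>>                     if (attempt >= maxTries) throw e; // give up, surface it
>>>                     Thread.sleep(10000L); // wait before the next attempt
>>>                 }
>>>             }
>>>         }
>>>     }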
>>> Can someone tell me the expected behavior under the following scenarios?
>>> - HDFS isn't available before ever trying to create/write to a file
>>> - HDFS becomes unavailable after already creating a file and
>>> starting to write to it
>>> - HDFS is unavailable when trying to close a file
>>> I'd also be happy to contribute the AMQP source. I wrote the old version
>>> for the original Flume.
>>> Let me know if you'd be interested, and thanks for the answers.