Matt Wise 2013-05-10, 17:29
We do that... but somehow we ended up with an event or two in the pipeline that were bad. It would be really nice if there were some way to choose what to do when a bad event is found -- rather than letting the pipeline fill up quickly. I.e.:
a) Dump the event to a data file and throw a warning in the log messages?
b) Throw the event away
c) Move the event to an alternate channel where it can be handled differently
Anything other than "stop pulling data from the channel and let the channel fill"
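Something close to option (c) can be approximated with Flume's failover sink processor: the normal time-bucketed HDFS sink gets the higher priority, and a backup sink whose path has no escape sequences catches whatever the primary cannot deliver. A rough sketch, with hypothetical agent and sink names -- note the caveat that failover reacts to sink-level failures, so good events in the same batch may land in the failsafe path alongside the bad one:

```properties
# Hypothetical names; both sinks drain the same channel.
agent.sinkgroups = g1
agent.sinkgroups.g1.sinks = hdfsSink failsafeSink
agent.sinkgroups.g1.processor.type = failover
# Higher priority is tried first; on failure Flume fails over to the next sink.
agent.sinkgroups.g1.processor.priority.hdfsSink = 10
agent.sinkgroups.g1.processor.priority.failsafeSink = 5
```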
On May 22, 2013, at 12:39 AM, Mike Percy <[EMAIL PROTECTED]> wrote:
> Hi Matt,
> Nope, there is currently no way to do that. But you could use the timestamp interceptor to make sure your events always have those headers.
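For reference, a minimal timestamp-interceptor stanza (the source name r1 is illustrative). Setting preserveExisting keeps a timestamp header the event already carries instead of overwriting it:

```properties
agent.sources.r1.interceptors = ts
agent.sources.r1.interceptors.ts.type = timestamp
# keep an existing timestamp header rather than overwriting it
agent.sources.r1.interceptors.ts.preserveExisting = true
```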
> On Mon, May 13, 2013 at 12:13 PM, Matt Wise <[EMAIL PROTECTED]> wrote:
> Great, that's working... thank you. Is there a way to give the HDFS plugin a 'failsafe' path to write messages to when they are missing that kind of data?
> On May 10, 2013, at 6:30 PM, Mike Percy <[EMAIL PROTECTED]> wrote:
> > Hook up a HDFS sink to them that doesn't use %Y, %m, etc in the configured path.
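Mike's suggestion amounts to a sink whose configured path contains no escape sequences, so no headers are needed to resolve it. A sketch with illustrative names and path:

```properties
agent.sinks.recovery.type = hdfs
# a static path: no %Y/%m/%d, so no timestamp header is required
agent.sinks.recovery.hdfs.path = hdfs://namenode/flume/recovery
agent.sinks.recovery.channel = c1
```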
> > HTH,
> > Mike
> > On May 10, 2013, at 11:00 AM, Matt Wise <[EMAIL PROTECTED]> wrote:
> >> Eek, this was worse than I thought. Turns out messages continued to be added to the channels, but no transactions could complete to take messages out of the channel. I've moved the file channels out of the way and restarted the service for now ... but how can I recover the rest of the data in these file channels?
> >> On May 10, 2013, at 10:29 AM, Matt Wise <[EMAIL PROTECTED]> wrote:
> >>> We were messing around with a few settings today and ended up getting a few messages into our channel that are bad (corrupt time field). How can I clear them out?
> >>>> 10 May 2013 17:28:26,920 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (org.apache.flume.SinkRunner$PollingRunner.run:160) - Unable to deliver event. Exception follows.
> >>>> org.apache.flume.EventDeliveryException: java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to resolve time based bucketing. Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
> >>>> at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:461)
> >>>> at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
> >>>> at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
> >>>> at java.lang.Thread.run(Thread.java:679)
> >>>> Caused by: java.lang.RuntimeException: Flume wasn't able to parse timestamp header in the event to resolve time based bucketing. Please check that you're correctly populating timestamp header (for example using TimestampInterceptor source interceptor).
> >>>> at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:160)
> >>>> at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:343)
> >>>> at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:392)
> >>>> ... 3 more
> >>>> Caused by: java.lang.NumberFormatException: null
> >>>> at java.lang.Long.parseLong(Long.java:401)
> >>>> at java.lang.Long.valueOf(Long.java:535)
> >>>> at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:158)
> >>>> ... 5 more
> >>> This message just keeps repeating over and over again... new events are coming through just fine.
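The root cause in the trace above is easy to reproduce: when the timestamp header is absent, the header lookup returns null, and passing that null into Long.valueOf is exactly where the "NumberFormatException: null" comes from. A minimal sketch -- the header map here simulates a Flume event's headers, it is not actual Flume code:

```java
import java.util.HashMap;
import java.util.Map;

public class MissingHeaderDemo {
    public static void main(String[] args) {
        // Simulated event headers with no "timestamp" key
        Map<String, String> headers = new HashMap<>();
        String ts = headers.get("timestamp"); // null when the header is missing

        try {
            Long.valueOf(ts); // the same parse BucketPath performs on the header
        } catch (NumberFormatException e) {
            System.out.println("caught NumberFormatException");
        }
    }
}
```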
Mike Percy 2013-05-23, 18:45