Home | About | Sematext search-lucene.com search-hadoop.com
Flume >> mail # user >> Problems with time variables in HDFS path


Re: Problems with time variables in HDFS path
The time variables depend on the presence of a header with the key
"timestamp". If that header is missing, the sink tries to parse a
non-existent value to compute the path, and you get exactly this
exception. I don't believe it has anything to do with the contents of
your log message.
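To illustrate the point, here is a minimal sketch (my own, not Flume's actual BucketPath code) of why a missing "timestamp" header surfaces as `NumberFormatException: null` — it assumes, as the stack trace suggests, that the escape sequences are resolved by parsing the event's "timestamp" header as a long:

```java
import java.util.HashMap;
import java.util.Map;

public class TimestampHeaderDemo {
    // Hypothetical stand-in for what BucketPath.replaceShorthand does
    // when it resolves a time escape such as %Y or %H.
    static long resolveTimestamp(Map<String, String> headers) {
        // headers.get("timestamp") is null when the header is absent;
        // Long.valueOf delegates to Long.parseLong, which throws
        // NumberFormatException with the message "null" for null input.
        return Long.valueOf(headers.get("timestamp"));
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        try {
            resolveTimestamp(headers); // no "timestamp" header set
        } catch (NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
        // With the header present, resolution succeeds.
        headers.put("timestamp", String.valueOf(System.currentTimeMillis()));
        System.out.println("resolved: " + resolveTimestamp(headers));
    }
}
```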

The easiest way to add the header is the TimestampInterceptor, so I
would recommend trying 1.2.0 as soon as it is released (or you can try
grabbing the current release candidate, or even the 1.3.0 trunk, which
I'm running right now without any serious issues). As this is a
frequent question, I've filed a JIRA to document this dependency
properly: https://issues.apache.org/jira/browse/FLUME-1364
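For reference, a rough sketch of what a 1.2.x configuration with the timestamp interceptor could look like — the agent, source, channel, and sink names below are placeholders of mine, not taken from your attached config:

```properties
# Hypothetical component names; the point of this sketch is only the
# interceptor wiring and the time escapes in the HDFS path.
agent.sources = syslog
agent.channels = mem
agent.sinks = hdfs-sink

agent.sources.syslog.type = syslogtcp
agent.sources.syslog.port = 5140
agent.sources.syslog.channels = mem

# The timestamp interceptor stamps each event with a "timestamp"
# header, which the HDFS sink needs to resolve %Y/%m/%d/%H escapes.
agent.sources.syslog.interceptors = ts
agent.sources.syslog.interceptors.ts.type = timestamp

agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.channel = mem
agent.sinks.hdfs-sink.hdfs.path = hdfs://namenode/flume/%Y/%m/%d/%H
```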

On 07/11/2012 06:41 PM, Christian Schroer wrote:
> Hi,
>
> we are running into a strange problem using Flume-NG 1.1.0 from CDH 4.0.1.
>
> Setup:
> Flume-NG opens a TCP syslog port, collects all messages and forwards them directly into HDFS. This works fine until we try to forward MS IIS logs in W3C format. The cause seems to be a " - " inside the log message. I could reproduce the problem using rsyslogd to forward all syslog messages to Flume:
>
> logger "Hello this is a test" => Works fine :)
>
> logger "hello - this will break" => breaks flume :(
>
> If I remove the time variables from the HDFS path in our configuration (attached) everything is working fine...
>
> Exception:
>
> 2012-07-11 11:08:18,292 ERROR hdfs.HDFSEventSink: process failed
> java.lang.NumberFormatException: null
>          at java.lang.Long.parseLong(Long.java:375)
>          at java.lang.Long.valueOf(Long.java:525)
>          at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:220)
>          at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:310)
>          at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:402)
>          at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>          at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>          at java.lang.Thread.run(Thread.java:662)
> 2012-07-11 11:08:18,294 ERROR flume.SinkRunner: Unable to deliver event. Exception follows.
> org.apache.flume.EventDeliveryException: java.lang.NumberFormatException: null
>          at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:469)
>          at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
>          at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
>          at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.NumberFormatException: null
>          at java.lang.Long.parseLong(Long.java:375)
>          at java.lang.Long.valueOf(Long.java:525)
>          at org.apache.flume.formatter.output.BucketPath.replaceShorthand(BucketPath.java:220)
>          at org.apache.flume.formatter.output.BucketPath.escapeString(BucketPath.java:310)
>          at org.apache.flume.sink.hdfs.HDFSEventSink.process(HDFSEventSink.java:402)
>          ... 3 more
>
> I attached our configuration in case something is broken in there.
>
> Best regards,
>
> Christian Schroer
>