Hao Jian 2012-09-12, 06:58
Re: data saved in hdfs by flume ng is garbled
Hi,

The Log4j appender doesn't pay any attention to the PatternLayout, at least
up through 1.3.0-SNAPSHOT. That's why you only see %m and nothing else.
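
Until that's fixed, one workaround is to fold those fields into the message
itself before logging, since only the rendered message body survives the trip
to Flume. A minimal sketch (the decorate() helper is hypothetical, not
anything from the Flume or log4j APIs):

import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.log4j.Logger;

public class Main {

    private static final Logger log = Logger.getLogger(Main.class);

    // Rebuild the "%d >> %t >> %m" part of the pattern by hand, since
    // the Log4jAppender ships only the message text.
    private static String decorate(String msg) {
        String ts = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss,SSS")
                .format(new Date());
        return ts + " >> " + Thread.currentThread().getName() + " >> " + msg;
    }

    public static void main(String[] args) {
        log.info(decorate("this is test1"));
    }
}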

As for the garbled text, this property:

agent.sinks.hdfsSink.hdfs.file.Type

should be this:

agent.sinks.hdfsSink.hdfs.fileType

Since you had file.Type, Flume ignored the unknown key and fell back to the
default fileType of SequenceFile. That's also why your HDFS output starts
with the "SEQ ... LongWritable ... BytesWritable" header: it's the
SequenceFile signature.
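
For reference, here is your hdfsSink block with just that one key renamed
(untested, otherwise exactly your settings):

agent.sinks.hdfsSink.channel = jdbcChannel
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.hdfs.path = hdfs://10.4.44.134/flume/events/
agent.sinks.hdfsSink.hdfs.rollCount = 0
agent.sinks.hdfsSink.hdfs.writeFormat = Writable
agent.sinks.hdfsSink.hdfs.fileType = DataStream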

Hope that helps!
Chris

On Wed, Sep 12, 2012 at 1:58 AM, Hao Jian <[EMAIL PROTECTED]> wrote:

>  Hi,
> I'm using Flume NG (1.2.0) to collect logs from log4j and save them to
> HDFS. There are two problems:
> (1) Flume only collects %m from log4j, but not %d, %p, %t ...
> (2) The log saved in HDFS is garbled, not plain text.
> My log4j configuration is as follows:
>  <appender name="Console" class="org.apache.log4j.ConsoleAppender">
> <layout class="org.apache.log4j.PatternLayout">
> <param name="ConversionPattern"
> value="%d >> %-5p >> %t >> %l >> %m%n"/>
> </layout>
> </appender>
> <appender name="flume"
> class="org.apache.flume.clients.log4jappender.Log4jAppender">
> <param name="Hostname" value="10.4.46.125" />
> <param name="Port" value="44444" />
> <layout class="org.apache.log4j.PatternLayout">
> <param name="ConversionPattern"
> value="%d >> %-5p >> %t >> %l >> %m%n"/>
> </layout>
> </appender>
>  <root>
> <level value="info"/>
> <appender-ref ref="Console"/>
> <appender-ref ref="flume"/>
> </root>
>  The console output is:
>  2012-09-12 11:43:37,391 >> INFO >> main >> Main.main(Main.java:22) >>
> this is test1
> 2012-09-12 11:43:37,454 >> INFO >> main >> Main.main(Main.java:22) >> this
> is test2
> 2012-09-12 11:43:37,460 >> INFO >> main >> Main.main(Main.java:22) >> this
> is test3
> 2012-09-12 11:43:37,465 >> INFO >> main >> Main.main(Main.java:22) >> this
> is test4
> 2012-09-12 11:43:37,470 >> INFO >> main >> Main.main(Main.java:22) >> this
> is test5
> 2012-09-12 11:43:37,475 >> INFO >> main >> Main.main(Main.java:22) >> this
> is test6
> 2012-09-12 11:43:37,480 >> INFO >> main >> Main.main(Main.java:22) >> this
> is test7
> 2012-09-12 11:43:37,485 >> INFO >> main >> Main.main(Main.java:22) >> this
> is test8
> 2012-09-12 11:43:37,492 >> INFO >> main >> Main.main(Main.java:22) >> this
> is test9
> 2012-09-12 11:43:37,497 >> INFO >> main >> Main.main(Main.java:22) >> this
> is test10
>  The flume configuration is:
>  agent.channels = jdbcChannel memChannel
> agent.sinks = hdfsSink fileSink
>  agent.channels.jdbcChannel.type = jdbc
> agent.channels.memChannel.type = memory
>  agent.sources.avroSrc.type = avro
>  agent.sources.avroSrc.bind = 0.0.0.0
> agent.sources.avroSrc.port = 40000
> agent.sources.avroSrc.channels = jdbcChannel memChannel
>  agent.sinks.hdfsSink.channel = jdbcChannel
> agent.sinks.hdfsSink.type = hdfs
>  agent.sinks.hdfsSink.hdfs.path = hdfs://10.4.44.134/flume/events/
> agent.sinks.hdfsSink.hdfs.rollCount = 0
> agent.sinks.hdfsSink.hdfs.writeFormat = Writable
> agent.sinks.hdfsSink.hdfs.file.Type = DataStream
>  agent.sinks.fileSink.channel = memChannel
> agent.sinks.fileSink.type = FILE_ROLL
> agent.sinks.fileSink.sink.directory = /var/log/flume/
> agent.sinks.fileSink.sink.rollInterval = 60
>  The log saved in /var/log/flume is:
>  this is test1
> this is test2
> this is test3
> this is test4
> this is test5
> this is test6
> this is test7
> this is test8
> this is test9
> this is test10
>  The log saved in HDFS is:
>  SEQ
> !org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable�������
> �
> �D� ���# �cQ��� ��� �� 9���K���
> this is test1��� ��� �� 9���m���
> this is test2��� ��� �� 9������
> this is test3��� ��� �� 9������
> this is test4��� ��� �� 9������
> this is test5��� ��� �� 9�������
> this is test6��� ��� �� 9�������
> this is test7��� ��� �� 9�������
> this is test8��� ��� �� 9�������
> this is test9��� ��� �� 9������� this is test10
>  or:
>
> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd
>
> \ufffdD\ufffd\ufffd\ufffd\ufffd#\ufffdcQ\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd9\ufffd\ufffd\ufffdK\ufffd\ufffd\ufffd