
Flume >> mail # user >> flume to HDFS log event write


RE: flume to HDFS log event write
The expected output I pasted is from the local file, where I can see it correctly. But when writing to HDFS it produces junk values, and I cannot see the timestamp or the other log information.

From: Bertrand Dechoux [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 09, 2013 3:39 PM
To: [EMAIL PROTECTED]
Subject: Re: flume to HDFS log event write

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/SequenceFile.html

is a binary format. You may want to make Flume output to a file or the console first,
and then compare what you are expecting with what you are getting.
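The junk values in the HDFS output suggest the sink is falling back to Flume's default SequenceFile format. A possible fix, sketched here on the assumption that `hdfs.file.Type` in the quoted sink configuration was meant to be the standard Flume HDFS sink property `hdfs.fileType` (without which the `DataStream` setting is silently ignored):

```properties
# Corrected sink settings (sketch): the property name is hdfs.fileType,
# not hdfs.file.Type; DataStream writes plain text instead of a SequenceFile.
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://172.20.104.226:8020/flumeinput/%{host}
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.rollCount = 10000
a1.sinks.k1.serializer = TEXT
```

Note that `hdfs.writeFormat` only applies to SequenceFiles, so it is dropped here.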

Regards

Bertrand
On Wed, Jan 9, 2013 at 11:02 AM, Chhaya Vishwakarma <[EMAIL PROTECTED]> wrote:
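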
Hi,

I am using the Flume log4j appender to write log events to HDFS, but the output contains some junk values, and I cannot see anything other than the log message itself: no timestamp or other log information.

Here is my configuration.
log4j.properties:

log4j.logger.log4jExample= DEBUG,out2
log4j.appender.out2 = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.out2.Port = 41414
log4j.appender.out2.Hostname = 172.20.104.223
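(As an aside on the missing timestamp: the Flume Log4jAppender by default sends only the rendered log message as the event body. One possible way to get the timestamp and level into the body, assuming the appender honors a configured layout, is to add a PatternLayout; this is a sketch, not part of the original configuration:)

```properties
# Hypothetical addition: format the event body with timestamp, thread,
# level, logger name, and message.
log4j.appender.out2.layout = org.apache.log4j.PatternLayout
log4j.appender.out2.layout.ConversionPattern = [%d] - [%t] %p %c %m%n
```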

Here is the agent configuration:
a1.sources = r1
a1.sinks = k1
a1.channels = c1

#sources
a1.sources.r1.type = avro
a1.sources.r1.bind =172.20.104.226
a1.sources.r1.port= 41414
a1.sources.r1.restart =true
a1.sources.r1.batchsize=10000

# Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path=hdfs://172.20.104.226:8020/flumeinput/%{host}
a1.sinks.k1.hdfs.file.Type=DataStream
a1.sinks.k1.hdfs.writeFormat=Writable
a1.sinks.k1.hdfs.rollCount=10000
a1.sinks.k1.serializer=TEXT

# Use a channel which buffers events in memory
a1.channels.c1.type = file
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Expected output:
[2013-01-09 15:15:45,457] - [main] DEBUG log4jExample Current data unavailalbe, using cached values
[2013-01-09 15:15:45,458] - [main] INFO  log4jExample Hello this is an info message
[2013-01-09 15:15:45,460] - [main] ERROR log4jExample Dabase unavaliable, connetion lost
[2013-01-09 15:15:45,461] - [main] WARN  log4jExample Attention!! Application running in debugmode
[2013-01-09 15:15:45,463] - [main] DEBUG log4jExample Current data unavailalbe, using cached values
[2013-01-09 15:15:45,465] - [main] INFO  log4jExample Hello this is an info message
[2013-01-09 15:15:45,467] - [main] ERROR log4jExample Dabase unavaliable, connetion lost
[2013-01-09 15:15:45,468] - [main] WARN  log4jExample Attention!! Application running in debugmode
[2013-01-09 15:15:45,470] - [main] DEBUG log4jExample Current data unavailalbe, using cached values

But I am getting this instead. Output on HDFS:
SEQ#6;!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable������+�#19;AE����9#8;<‑��-Current data unavailalbe, using cached values)#8;<‑��Hello this is an info message.#8;<‑��"Dabase unavaliable, connetion lost8#8;<‑��,Attention!! Application running in debugmode9#8;<‑��-Current data unavailalbe, using cached values)#8;<‑��Hello this is an info message.#8;<‑��"Dabase unavaliable, connetion lost8#8;<‑��#28;,Attention!! Application running in debugmode9#8;<‑��-Current data unavailalbe, using cached values)#8;<‑��‑Hello this is an info message.#8;<‑��‑"Dabase unavaliable, connetion lost8#8;<‑��­,Attention!! Application running in debugmode9#8;<‑�� -Current data unavailalbe, using cached values)#8;<‑�� Hello this is an info message.#8;<‑��!"Dabase unavaliable, connetion lost8#8;<‑��",Attention!! Application running in debugmode9#8;<‑��"-Current data unavailalbe, using cached values)#8;<‑��#Hello this is an info message.#8;<‑��#"Dabase unavaliable, connetion lost8#8;<‑��$,Attention!! Application running in debugmode9#8;<‑��$-Current data unavailalbe, using cached values)#8;<‑��%Hello this is an info message.<‑��%"Dabase unavaliable, connetion lost8#8;<‑��%,Attention!! Application running in debugmode9#8;<‑��&-Current data unavailalbe, using cached values)#8;<‑��&Hello this is an info message.#8;<‑��'"Dabase unavaliable, connetion lost8#8;<‑��(,Attention!! Application running in debugmode
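The leading bytes of the output above (`SEQ` followed by the `LongWritable`/`BytesWritable` class names) identify the file as a Hadoop SequenceFile, which confirms the binary-format explanation. A quick way to verify this from the first bytes of such a file (a minimal sketch, not part of the original thread; the sample bytes are illustrative):

```python
def is_sequence_file(first_bytes: bytes) -> bool:
    # Hadoop SequenceFiles begin with the 3-byte magic "SEQ",
    # followed by a single version byte and the key/value class names.
    return first_bytes[:3] == b"SEQ"

# Illustrative sample mimicking the start of the junk HDFS output above.
sample = b"SEQ\x06!org.apache.hadoop.io.LongWritable"
print(is_sequence_file(sample))                      # True
print(is_sequence_file(b"[2013-01-09 15:15:45,457]"))  # False
```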

