Re: strange flume hdfs put
Hello,
I changed the conf file like this:
[zhouhh@Hadoop48 flume1.3.1]$ cat conf/testhdfs.conf
syslog-agent.sources = Syslog
syslog-agent.channels = MemoryChannel-1
syslog-agent.sinks = HDFS-LAB

syslog-agent.sources.Syslog.type = syslogTcp
syslog-agent.sources.Syslog.port = 5140

syslog-agent.sources.Syslog.channels = MemoryChannel-1
syslog-agent.sinks.HDFS-LAB.channel = MemoryChannel-1

syslog-agent.sinks.HDFS-LAB.type = hdfs

syslog-agent.sinks.HDFS-LAB.hdfs.path = hdfs://Hadoop48:54310/flume/%{host}
syslog-agent.sinks.HDFS-LAB.hdfs.file.Prefix = syslogfiles
syslog-agent.sinks.HDFS-LAB.hdfs.file.rollInterval = 60
#syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = SequenceFile
#syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = DataStream
#syslog-agent.sinks.HDFS-LAB.hdfs.file.writeFormat= Text
syslog-agent.channels.MemoryChannel-1.type = memory
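
Note: the hdfs.file.* property names above don't match the ones documented for the Flume 1.3 HDFS sink (hdfs.filePrefix, hdfs.rollInterval, hdfs.fileType, hdfs.writeFormat), so they are most likely ignored and the sink falls back to its defaults, including SequenceFile output, which would explain the SEQ header in the test below. A sketch of the sink section using the documented names:

syslog-agent.sinks.HDFS-LAB.type = hdfs
syslog-agent.sinks.HDFS-LAB.hdfs.path = hdfs://Hadoop48:54310/flume/%{host}
syslog-agent.sinks.HDFS-LAB.hdfs.filePrefix = syslogfiles
syslog-agent.sinks.HDFS-LAB.hdfs.rollInterval = 60
# write events as plain text instead of the default SequenceFile
syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream
syslog-agent.sinks.HDFS-LAB.hdfs.writeFormat = Text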

Then I tested again:
[zhouhh@Hadoop47 ~]$ echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh " |
nc -v hadoop48 5140
Connection to hadoop48 5140 port [tcp/*] succeeded!
[zhouhh@Hadoop47 ~]$ hadoop fs -cat
hdfs://Hadoop48:54310/flume//FlumeData.1361245092567.tmp
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒▒▒ʣ

g▒▒C%< <▒▒)Mon Feb 18 18:25:26 2013 hello world zhh [zhouhh@Hadoop47 ~]$

There is still some text that looks wrong.
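
The leading "SEQ" plus the org.apache.hadoop.io.LongWritable and org.apache.hadoop.io.BytesWritable class names are the SequenceFile header, so the sink is evidently still writing SequenceFiles (the fileType lines in the conf are commented out, and SequenceFile is the default). As a side note, hadoop fs -text will decode a SequenceFile that hadoop fs -cat shows raw, e.g. against the same path:

[zhouhh@Hadoop47 ~]$ hadoop fs -text hdfs://Hadoop48:54310/flume//FlumeData.1361245092567.tmp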

Andy

2013/2/19 Hari Shreedharan <[EMAIL PROTECTED]>

>  This is because the data is written out by default in Hadoop's
> SequenceFile format. Use the DataStream file format (as in the Flume docs)
> to get the event written out as-is (if you use the default serializer, the
> headers will not be serialized, so make sure you select the correct
> serializer).
>
>
> Hari
>
> --
> Hari Shreedharan
>
> On Monday, February 18, 2013 at 7:09 PM, 周梦想 wrote:
>
> Hello,
> I put some data to HDFS via Flume 1.3.1, but it changed!
>
> source data:
> [zhouhh@Hadoop47 ~]$  echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh
> " | nc -v hadoop48 5140
> Connection to hadoop48 5140 port [tcp/*] succeeded!
>
> The Flume agent logged:
> 13/02/19 10:43:46 INFO hdfs.BucketWriter: Creating
> hdfs://Hadoop48:54310/flume//FlumeData.1361241606972.tmp
> 13/02/19 10:44:16 INFO hdfs.BucketWriter: Renaming
> hdfs://Hadoop48:54310/flume/FlumeData.1361241606972.tmp to
> hdfs://Hadoop48:54310/flume/FlumeData.1361241606972
>
> The content in HDFS:
>
> [zhouhh@Hadoop47 ~]$ hadoop fs -cat
>  hdfs://Hadoop48:54310/flume/FlumeData.1361241606972
> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒.FI▒Z▒Q{2▒,\<▒U▒Y)Mon
> Feb 18 18:25:26 2013 hello world zhh
> [zhouhh@Hadoop47 ~]$
>
> I don't know why there is data like
> "org.apache.hadoop.io.LongWritable" in the file. Is this a bug?
>
> Best Regards,
> Andy
>
>
>