Re: strange flume hdfs put
Thank you, Hari.
After removing the dot between "file" and "Type", it works:
[zhouhh@Hadoop47 ~]$ hadoop fs -cat
hdfs://Hadoop48:54310/flume//FlumeData.1361254179075.tmp
Mon Feb 18 18:25:26 2013 hello world zhh
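
For anyone hitting the same issue, the working sink settings presumably end up as below (a sketch only: the Flume HDFS sink documentation uses camelCase names such as hdfs.fileType, hdfs.filePrefix and hdfs.rollInterval, and the dotted hdfs.file.Type form is presumably just ignored as an unknown property):

# camelCase property names per the Flume HDFS sink docs;
# the values simply mirror the intent of the original conf file
syslog-agent.sinks.HDFS-LAB.hdfs.filePrefix = syslogfiles
syslog-agent.sinks.HDFS-LAB.hdfs.rollInterval = 60
syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream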
2013/2/19 Hari Shreedharan <[EMAIL PROTECTED]>

> Did you remove the "." between file and Type?
>
>
> On Monday, February 18, 2013, 周梦想 wrote:
>
>> Yes, I uncommented that line, but the same problem remains.
>>
>> [zhouhh@Hadoop47 ~]$ hadoop fs -cat
>> hdfs://Hadoop48:54310/flume//FlumeData.1361245658255.tmp
>> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒뿱▒5▒_▒rU▒<▒\▒)Mon
>> Feb 18 18:25:26 2013 hello world zhh
>>
>> 2013/2/19 Hari Shreedharan <[EMAIL PROTECTED]>
>>
>>  See comment below.
>>
>> --
>> Hari Shreedharan
>>
>> On Monday, February 18, 2013 at 7:43 PM, 周梦想 wrote:
>>
>> hello,
>> I changed the conf file like this:
>> [zhouhh@Hadoop48 flume1.3.1]$ cat conf/testhdfs.conf
>> syslog-agent.sources = Syslog
>> syslog-agent.channels = MemoryChannel-1
>> syslog-agent.sinks = HDFS-LAB
>>
>> syslog-agent.sources.Syslog.type = syslogTcp
>> syslog-agent.sources.Syslog.port = 5140
>>
>> syslog-agent.sources.Syslog.channels = MemoryChannel-1
>> syslog-agent.sinks.HDFS-LAB.channel = MemoryChannel-1
>>
>> syslog-agent.sinks.HDFS-LAB.type = hdfs
>>
>> syslog-agent.sinks.HDFS-LAB.hdfs.path = hdfs://Hadoop48:54310/flume/%{host}
>> syslog-agent.sinks.HDFS-LAB.hdfs.file.Prefix = syslogfiles
>> syslog-agent.sinks.HDFS-LAB.hdfs.file.rollInterval = 60
>> #syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = SequenceFile
>> #syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = DataStream
>>
>> You need to uncomment the above line and change it
>> to: syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream
>>
>> #syslog-agent.sinks.HDFS-LAB.hdfs.file.writeFormat= Text
>> syslog-agent.channels.MemoryChannel-1.type = memory
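>>
>> (One thing worth noting in the output paths: the files land under flume//
>> with a double slash because %{host} in hdfs.path expands to an empty
>> string; these test events presumably carry no "host" header. If per-host
>> directories are wanted, a host interceptor on the source should populate
>> it, e.g.:
>> syslog-agent.sources.Syslog.interceptors = i1
>> syslog-agent.sources.Syslog.interceptors.i1.type = host )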
>>
>> and I test again:
>> [zhouhh@Hadoop47 ~]$ echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh
>> " | nc -v hadoop48 5140
>> Connection to hadoop48 5140 port [tcp/*] succeeded!
>> [zhouhh@Hadoop47 ~]$ hadoop fs -cat
>> hdfs://Hadoop48:54310/flume//FlumeData.1361245092567.tmp
>>
>> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒▒▒ʣ
>>
>>   g▒▒C%< <▒▒)Mon Feb 18 18:25:26 2013 hello world zhh [zhouhh@Hadoop47~]$
>>
>> There is still some text that looks wrong.
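>>
>> (The SEQ!org.apache.hadoop.io.LongWritable... prefix is just the
>> SequenceFile header and record framing, not corruption. Unlike -cat,
>> hadoop fs -text deserializes SequenceFiles, so
>> hadoop fs -text hdfs://Hadoop48:54310/flume//FlumeData.1361245092567.tmp
>> would print the payload readably.)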
>>
>> Andy
>>
>> 2013/2/19 Hari Shreedharan <[EMAIL PROTECTED]>
>>
>>  This is because the data is written out by default in Hadoop's
>> SequenceFile format. Use the DataStream file format (as in the Flume docs)
>> to get the event written out as-is (if you use the default serializer, the
>> headers will not be serialized, so make sure you select the correct
>> serializer).
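>>
>> A minimal sketch of what that looks like in the sink config (serializer
>> and serializer.* are HDFS sink properties; TEXT is the default serializer
>> and writes only the event body, dropping headers):
>> syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream
>> syslog-agent.sinks.HDFS-LAB.serializer = TEXT
>> syslog-agent.sinks.HDFS-LAB.serializer.appendNewline = true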
>>
>>
>> Hari
>>
>> --
>> Hari Shreedharan
>>
>> On Monday, February 18, 2013 at 7:09 PM, 周梦想 wrote:
>>
>> hello,
>> I put some data into HDFS via Flume 1.3.1, but it got changed!
>>
>> source data:
>> [zhouhh@Hadoop47 ~]$  echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh
>> " | nc -v hadoop48 5140
>> Connection to hadoop48 5140 port [tcp/*] succeeded!
>>
>> the flume agent received:
>> 13/02/19 10:43:46 INFO hdfs.BucketWriter: Creating
>> hdfs://Hadoop48:54310/flume//FlumeData.1361241606972.tmp
>> 13/02/19 10:44:16 INFO hdfs.BucketWriter: Renaming
>> hdfs://Hadoop48:54310/flume/FlumeData.1361241606972.tmp to
>> hdfs://Hadoop48:54310/flume/FlumeData.1361241606972
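>>
>> (The 30-second gap between the Creating and Renaming lines is expected:
>> events are staged in the .tmp file until the sink rolls it, and
>> hdfs.rollInterval defaults to 30 seconds, e.g.
>> syslog-agent.sinks.HDFS-LAB.hdfs.rollInterval = 30 )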
>>
>> the content in HDFS:
>>
>> [zhouhh@Hadoop47 ~]$ hadoop fs -cat
>>  hdfs://Hadoop48:54310/flume/FlumeData.1361241606972
>> SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒.FI▒Z▒Q{2▒,\<▒U▒Y)Mon
>> Feb 18 18:25:26 2013 hello world zhh
>> [zhouhh@Hadoop47 ~]$
>>
>>