|
周梦想
2013-02-19, 03:09
hoo.smth
2013-02-19, 03:14
Hari Shreedharan
2013-02-19, 03:17
周梦想
2013-02-19, 03:43
周梦想
2013-02-19, 03:50
Hari Shreedharan
2013-02-19, 03:51
周梦想
2013-02-19, 04:15
Hari Shreedharan
2013-02-19, 05:09
周梦想
2013-02-19, 06:12
|
-
strange flume hdfs put 周梦想 2013-02-19, 03:09
Hello,
I put some data into HDFS via Flume 1.3.1, but the stored content differs from what I sent.

Source data:

[zhouhh@Hadoop47 ~]$ echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh " | nc -v hadoop48 5140
Connection to hadoop48 5140 port [tcp/*] succeeded!

The Flume agent logged:

13/02/19 10:43:46 INFO hdfs.BucketWriter: Creating hdfs://Hadoop48:54310/flume//FlumeData.1361241606972.tmp
13/02/19 10:44:16 INFO hdfs.BucketWriter: Renaming hdfs://Hadoop48:54310/flume/FlumeData.1361241606972.tmp to hdfs://Hadoop48:54310/flume/FlumeData.1361241606972

The content in HDFS:

[zhouhh@Hadoop47 ~]$ hadoop fs -cat hdfs://Hadoop48:54310/flume/FlumeData.1361241606972
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒.FI▒Z▒Q{2▒,\<▒U▒Y)Mon Feb 18 18:25:26 2013 hello world zhh
[zhouhh@Hadoop47 ~]$

I don't know why the file contains data like "org.apache.hadoop.io.LongWritable". Is this a bug?

Best Regards,
Andy
-
Re: strange flume hdfs put hoo.smth 2013-02-19, 03:14
Just change the writeFormat of the HDFS sink.
-
Re: strange flume hdfs put Hari Shreedharan 2013-02-19, 03:17
This is because the data is written out by default in Hadoop's SequenceFile format. Use the DataStream file format (as in the Flume docs) to get the event written as is. Note that with the default serializer the event headers are not serialized, so make sure you select the correct serializer.
Hari

--
Hari Shreedharan
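
For readers wondering what the leading bytes are: a SequenceFile begins with the three ASCII bytes `SEQ` and a one-byte version number, followed by the key and value class names. That is consistent with the `SEQ`, `org.apache.hadoop.io.LongWritable` and `org.apache.hadoop.io.BytesWritable` text in the `hadoop fs -cat` output quoted in this thread. A minimal Python sketch for checking whether a file's first bytes carry that magic is below; the function name `looks_like_sequence_file` and the sample bytes are illustrative assumptions, not part of Flume or Hadoop.

```python
SEQ_MAGIC = b"SEQ"  # first three bytes of a Hadoop SequenceFile header


def looks_like_sequence_file(data: bytes) -> bool:
    """Return True if data starts with the SequenceFile magic plus a version byte."""
    return len(data) >= 4 and data[:3] == SEQ_MAGIC


# Illustrative samples (hypothetical bytes, modelled on the output in this thread).
sequence_sample = b"SEQ\x06!org.apache.hadoop.io.LongWritable"
plain_sample = b"Mon Feb 18 18:25:26 2013 hello world zhh\n"

print(looks_like_sequence_file(sequence_sample))  # True
print(looks_like_sequence_file(plain_sample))     # False
```

Only the first few bytes are needed, so the check can be run on the start of a file copied out of HDFS without reading all of it.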
-
Re: strange flume hdfs put 周梦想 2013-02-19, 03:43
Hello,
I changed the conf file like this:

[zhouhh@Hadoop48 flume1.3.1]$ cat conf/testhdfs.conf
syslog-agent.sources = Syslog
syslog-agent.channels = MemoryChannel-1
syslog-agent.sinks = HDFS-LAB

syslog-agent.sources.Syslog.type = syslogTcp
syslog-agent.sources.Syslog.port = 5140

syslog-agent.sources.Syslog.channels = MemoryChannel-1
syslog-agent.sinks.HDFS-LAB.channel = MemoryChannel-1

syslog-agent.sinks.HDFS-LAB.type = hdfs

syslog-agent.sinks.HDFS-LAB.hdfs.path = hdfs://Hadoop48:54310/flume/%{host}
syslog-agent.sinks.HDFS-LAB.hdfs.file.Prefix = syslogfiles
syslog-agent.sinks.HDFS-LAB.hdfs.file.rollInterval = 60
#syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = SequenceFile
#syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = DataStream
#syslog-agent.sinks.HDFS-LAB.hdfs.file.writeFormat= Text
syslog-agent.channels.MemoryChannel-1.type = memory

Then I tested again:

[zhouhh@Hadoop47 ~]$ echo "<13>Mon Feb 18 18:25:26 2013 hello world zhh " | nc -v hadoop48 5140
Connection to hadoop48 5140 port [tcp/*] succeeded!
[zhouhh@Hadoop47 ~]$ hadoop fs -cat hdfs://Hadoop48:54310/flume//FlumeData.1361245092567.tmp
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒▒▒ʣ g▒▒C%< <▒▒)Mon Feb 18 18:25:26 2013 hello world zhh
[zhouhh@Hadoop47 ~]$

There is still some text in the file that looks wrong.

Andy
-
Re: strange flume hdfs put 周梦想 2013-02-19, 03:50
Sorry, a correction: in the file I actually tested, this line is not commented out (it has no "#"):

syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = DataStream

Result:

[zhouhh@Hadoop47 ~]$ hadoop fs -cat hdfs://Hadoop48:54310/flume//FlumeData.1361245658255.tmp
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒뿱▒5▒_▒rU▒<▒\▒)Mon Feb 18 18:25:26 2013 hello world zhh
-
Re: strange flume hdfs put Hari Shreedharan 2013-02-19, 03:51
See comment below.

--
Hari Shreedharan

On Monday, February 18, 2013 at 7:43 PM, 周梦想 wrote:
> #syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = SequenceFile
> #syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = DataStream

You need to uncomment the above line and change it to:

syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream
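
The fix above comes down to the property name: the HDFS sink reads `hdfs.fileType`, whereas `hdfs.file.Type` is a different key that the sink does not look up, so the default SequenceFile type stays in effect. A minimal Python sketch that flags such keys in a config file follows. The helper `find_suspect_hdfs_keys` is hypothetical, and `KNOWN_HDFS_KEYS` is a hand-picked subset of HDFS sink property names, so treat it as an assumption to verify against the Flume User Guide for your version.

```python
# Subset of HDFS sink property names (assumed; verify against the Flume User Guide).
KNOWN_HDFS_KEYS = {
    "hdfs.path", "hdfs.filePrefix", "hdfs.rollInterval",
    "hdfs.rollSize", "hdfs.rollCount", "hdfs.batchSize",
    "hdfs.fileType", "hdfs.writeFormat",
}


def find_suspect_hdfs_keys(config_text: str) -> list:
    """Return 'hdfs.*' sink keys that are not in KNOWN_HDFS_KEYS."""
    suspects = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and lines without key=value
        key = line.split("=", 1)[0].strip()
        parts = key.split(".")
        # Expected layout: <agent>.sinks.<sink>.hdfs.<property...>
        if len(parts) > 4 and parts[1] == "sinks" and parts[3] == "hdfs":
            prop = ".".join(parts[3:])
            if prop not in KNOWN_HDFS_KEYS:
                suspects.append(prop)
    return suspects


sample = """
syslog-agent.sinks.HDFS-LAB.type = hdfs
syslog-agent.sinks.HDFS-LAB.hdfs.path = hdfs://Hadoop48:54310/flume/%{host}
syslog-agent.sinks.HDFS-LAB.hdfs.file.Type = DataStream
syslog-agent.sinks.HDFS-LAB.hdfs.fileType = DataStream
"""
print(find_suspect_hdfs_keys(sample))  # ['hdfs.file.Type']
```

Run against the config posted earlier in this thread, the same check would also report `hdfs.file.Prefix` and `hdfs.file.rollInterval`.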
-
Re: strange flume hdfs put 周梦想 2013-02-19, 04:15
Yes, I uncommented that line, but the problem is the same.

[zhouhh@Hadoop47 ~]$ hadoop fs -cat hdfs://Hadoop48:54310/flume//FlumeData.1361245658255.tmp
SEQ!org.apache.hadoop.io.LongWritable"org.apache.hadoop.io.BytesWritable▒뿱▒5▒_▒rU▒<▒\▒)Mon Feb 18 18:25:26 2013 hello world zhh
-
Re: strange flume hdfs put Hari Shreedharan 2013-02-19, 05:09
Did you remove the "." between file and Type?
-
Re: strange flume hdfs put 周梦想 2013-02-19, 06:12
Thank you, Hari.
After removing the dot between "file" and "Type", it works:

[zhouhh@Hadoop47 ~]$ hadoop fs -cat hdfs://Hadoop48:54310/flume//FlumeData.1361254179075.tmp
Mon Feb 18 18:25:26 2013 hello world zhh
|