Re: FLUME AVRO
Mike Percy 2012-08-13, 00:50
Mohit,
For historical reasons, the default fileType for the HDFS sink is SequenceFile. If you want Avro container format, then you must use fileType = DataStream and use an event serializer that supports Avro, such as AVRO_EVENT. See the user guide for the HDFS sink config options:

http://flume.apache.org/FlumeUserGuide.html#hdfs-sink

BTW, the AVRO_EVENT event serializer has some of its own options to control compression, sync interval, etc., which are unfortunately not documented, but you can find them in this file:

https://git-wip-us.apache.org/repos/asf?p=flume.git;a=blob;f=flume-ng-core/src/main/java/org/apache/flume/serialization/AvroEventSerializerConfigurationConstants.java;h=cce67166f270bc7e4134f4aa577e1a01e88d409d;hb=trunk

They are syncIntervalBytes and compressionCodec.

Regards,
Mike

On Sun, Aug 12, 2012 at 5:34 PM, Mohit Anchlia <[EMAIL PROTECTED]> wrote:

> On Sun, Aug 12, 2012 at 5:29 PM, Mike Percy <[EMAIL PROTECTED]> wrote:
>
>> Assuming you are writing to a file or HDFS then look at the
>> EventSerializer interface - there is an abstract class that implements that
>> interface which you can use for writing Avro.
>>
>> https://cwiki.apache.org/confluence/display/FLUME/Flume+1.x+Event+Serializers
>>
>> http://flume.apache.org/releases/content/1.2.0/apidocs/org/apache/flume/serialization/EventSerializer.html
>>
>> This is an out-of-the-box Avro serializer that ships with Flume (its
>> alias is AVRO_EVENT):
>>
>> http://flume.apache.org/releases/content/1.2.0/apidocs/org/apache/flume/serialization/FlumeEventAvroEventSerializer.html
>>
>> If you want to use your own Avro schema then you can just implement this
>> abstract class and override the convert() method:
>>
>> http://flume.apache.org/releases/content/1.2.0/apidocs/org/apache/flume/serialization/AbstractAvroEventSerializer.html
>
> Is the data that is written in the hdfs file in avro format and uses avro
> datafile? From what I understand data that is written in HDFS is not in Avro
> format and goes in the sequence file.
>
>> Regards,
>> Mike
>>
>> On Sun, Aug 12, 2012 at 6:15 AM, Harsh J <[EMAIL PROTECTED]> wrote:
>>
>>> Abhishek,
>>>
>>> Moving this to user@flume lists, as it is Flume specific.
>>>
>>> P.s. Please do not cross post to multiple lists, it does not guarantee
>>> you a faster response nor is mailing to a *-dev list relevant to your
>>> question here. Help avoid additional inbox noise! :)
>>>
>>> On Thu, Aug 9, 2012 at 10:43 PM, abhiTowson cal
>>> <[EMAIL PROTECTED]> wrote:
>>> > hi all,
>>> >
>>> > can log data be converted into avro,when data is sent from source to
>>> sink.
>>> >
>>> > Regards
>>> > Abhishek
>>>
>>> --
>>> Harsh J
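
For reference, the HDFS sink settings described in the reply at the top of this thread might look roughly like the fragment below. This is only a sketch: the agent, sink, and channel names (agent1, hdfsSink, memChannel) and the HDFS path are made up for illustration, and the compressionCodec and syncIntervalBytes values are example values rather than recommendations. Only hdfs.fileType, the serializer alias, and the two serializer option names come from the discussion above.

```properties
# Hypothetical names: agent1, hdfsSink, memChannel, and the path are placeholders.
agent1.sinks.hdfsSink.type = hdfs
agent1.sinks.hdfsSink.channel = memChannel
agent1.sinks.hdfsSink.hdfs.path = hdfs://namenode/flume/events

# Override the default SequenceFile so the serializer output is written as-is.
agent1.sinks.hdfsSink.hdfs.fileType = DataStream

# Built-in Avro event serializer (alias from the thread).
agent1.sinks.hdfsSink.serializer = AVRO_EVENT

# Serializer options mentioned in the reply; the values here are illustrative.
agent1.sinks.hdfsSink.serializer.compressionCodec = snappy
agent1.sinks.hdfsSink.serializer.syncIntervalBytes = 2048000
```

With hdfs.fileType left at its default, the sink would still produce SequenceFiles regardless of the serializer setting, which matches the behaviour Mohit describes.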