Flume >> mail # user >> How to use LZO in Flume-ng


Re: How to use LZO in Flume-ng
Hi Kevin,
    I got LZO working successfully. I will post my LZO configuration so you
can compare it with yours and spot the difference.

    1. agent.sinks.hdfsSin1.hdfs.codeC = com.hadoop.compression.lzo.LzoCodec
    2. Add this property to Hadoop's core-site.xml:
       <property>
            <name>io.compression.codecs</name>
            <value>com.hadoop.compression.lzo.LzoCodec</value>
       </property>
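Applied to your posted flume-lzo.conf, the key change is replacing the short codec name with the fully-qualified codec class (a sketch based on your own sink name; the rest of your config stays unchanged):

```properties
# In flume-lzo.conf, replace the short codec name:
#   agent.sinks.lzo-hdfs-write.hdfs.codeC = lzo
# with the fully-qualified codec class:
agent.sinks.lzo-hdfs-write.hdfs.codeC = com.hadoop.compression.lzo.LzoCodec
agent.sinks.lzo-hdfs-write.hdfs.fileType = CompressedStream
```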

-Regards
Denny Ye
2012/8/28 Kevin Lee <[EMAIL PROTECTED]>

> Folks,
>
> I was following this link, Hadoop at Twitter (part 1): Splittable LZO
> Compression <http://www.cloudera.com/blog/2009/11/hadoop-at-twitter-part-1-splittable-lzo-compression/>, to
> integrate LZO into Hadoop 2.0, but LZO compression in Flume-ng does not seem to work.
>
> My flume-ng configuration file is:
>
> cat > /tmp/flume-lzo.conf <<EOF
> agent.sources = lzo-avro-collect
> agent.channels = lzo-memory-channel
> agent.sinks = lzo-hdfs-write
>
> agent.sources.lzo-avro-collect.type = avro
> agent.sources.lzo-avro-collect.bind = 0.0.0.0
> agent.sources.lzo-avro-collect.port = 12345
> agent.sources.lzo-avro-collect.channels = lzo-memory-channel
> agent.channels.lzo-memory-channel.type = memory
> agent.channels.lzo-memory-channel.capacity = 1000000
> agent.channels.lzo-memory-channel.transactionCapacity = 10000
> agent.channels.lzo-memory-channel.stay-alive = 3
> agent.sinks.lzo-hdfs-write.type = hdfs
> agent.sinks.lzo-hdfs-write.hdfs.path = hdfs://10.34.4.55:8020/tmp/
> agent.sinks.lzo-hdfs-write.hdfs.filePrefix = test%Y
> agent.sinks.lzo-hdfs-write.channel = lzo-memory-channel
> agent.sinks.lzo-hdfs-write.hdfs.rollInterval = 3600
> agent.sinks.lzo-hdfs-write.hdfs.rollSize = 209715200
> agent.sinks.lzo-hdfs-write.hdfs.rollCount = 0
> agent.sinks.lzo-hdfs-write.hdfs.batchSize = 1000
> agent.sinks.lzo-hdfs-write.hdfs.codeC = lzo
> agent.sinks.lzo-hdfs-write.hdfs.fileType = CompressedStream
> EOF
>
> I start the flume-ng agent in the foreground:
>
> sudo -u flume flume-ng agent -n agent -f /tmp/flume-lzo.conf
>
> Then I use the avro-client to ship an event:
>
> echo aaaaaaaaaaaaaaaaa > /tmp/events
> sudo -u flume flume-ng avro-client -H localhost -p 12345 -F /tmp/events
>
> The flume-ng agent log is as follows:
>
> 12/08/28 06:33:53 INFO lzo.GPLNativeCodeLoader: Loaded native gpl library
> 12/08/28 06:33:53 INFO lzo.LzoCodec: Successfully loaded & initialized native-lzo library [hadoop-lzo rev 6bb1b7f8b9044d8df9b4d2b6641db7658aab3cf8]
> 12/08/28 06:33:54 INFO hdfs.HDFSEventSink: Hadoop Security enabled: false
> 12/08/28 06:33:54 INFO nodemanager.DefaultLogicalNodeManager: Starting new configuration:{ sourceRunners:{lzo-avro-collect=EventDrivenSourceRunner: { source:AvroSource: { bindAddress:0.0.0.0 port:12345 } }} sinkRunners:{lzo-hdfs-write=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@39e57e8f counterGroup:{ name:null counters:{} } }} channels:{lzo-memory-channel=org.apache.flume.channel.MemoryChannel@9d7fbfb} }
> 12/08/28 06:33:54 INFO nodemanager.DefaultLogicalNodeManager: Starting Channel lzo-memory-channel
> 12/08/28 06:33:54 INFO nodemanager.DefaultLogicalNodeManager: Starting Sink lzo-hdfs-write
> 12/08/28 06:33:54 INFO nodemanager.DefaultLogicalNodeManager: Starting Source lzo-avro-collect
> 12/08/28 06:33:54 INFO source.AvroSource: Avro source starting:AvroSource: { bindAddress:0.0.0.0 port:12345 }
> 12/08/28 06:34:02 INFO ipc.NettyServer: [id: 0x651db6bb, /127.0.0.1:48085 => /127.0.0.1:12345] OPEN
> 12/08/28 06:34:02 INFO ipc.NettyServer: [id: 0x651db6bb, /127.0.0.1:48085 => /127.0.0.1:12345] BOUND: /127.0.0.1:12345
> 12/08/28 06:34:02 INFO ipc.NettyServer: [id: 0x651db6bb, /127.0.0.1:48085 => /127.0.0.1:12345] CONNECTED: /127.0.0.1:48085
> 12/08/28 06:34:02 INFO ipc.NettyServer: [id: 0x651db6bb, /127.0.0.1:48085 :> /127.0.0.1:12345] DISCONNECTED
> 12/08/28 06:34:02 INFO ipc.NettyServer: [id: 0x651db6bb, /127.0.0.1:48085 :> /127.0.0.1:12345] UNBOUND
> 12/08/28 06:34:02 INFO ipc.NettyServer: [id: 0x651db6bb, /127.0.0.1:48085 :> /127.0.0.1:12345] CLOSED
> 12/08/28 06:34:03 INFO hdfs.BucketWriter: Creating hdfs://10.34.4.55:8020/tmp//test.1346135643045.lzo_deflate.tmp