Harish Mandala 2012-09-24, 22:01
Harish Mandala 2012-09-25, 19:17
Re: HDFS Event Sink problems
Mike Percy 2012-09-25, 21:00
Harish,
What did you find on your side? Could it be related to https://issues.apache.org/jira/browse/FLUME-1610 ? I am looking at that issue right now.

Regards,
Mike

On Tue, Sep 25, 2012 at 12:17 PM, Harish Mandala <[EMAIL PROTECTED]> wrote:
> Thanks, but I understood why this is happening.
>
> On Mon, Sep 24, 2012 at 6:01 PM, Harish Mandala <[EMAIL PROTECTED]> wrote:
>
>> Hello,
>>
>> I’m having some trouble with the HDFS Event Sink. I’m using the latest
>> version of flume NG, checked out today.
>>
>> I am using curloader to hit “MycustomSource”, which essentially takes in
>> HTTP messages, and splits the content into 2 “kinds” of flume events
>> (differentiated by header key-value). The first kind is sent to hdfs-sink1,
>> and the second kind to hdfs-sink2 by a multiplexing selector as outlined in
>> the configuration below. There’s also an hdfs-sink3 which can be ignored at
>> present.
>>
>> I can’t really understand what’s going on. It seems related to some of
>> the race condition issues outlined here:
>>
>> https://issues.apache.org/jira/browse/FLUME-1219
>>
>> Please let me know if you need more information.
>>
>> The following is my conf file. It is followed by flume.log.
>>
>> #### flume.conf ####
>>
>> agent1.channels = ch1 ch2 ch3
>> agent1.sources = mycustom-source1
>> agent1.sinks = hdfs-sink1 hdfs-sink2 hdfs-sink3
>>
>> # Define a memory channel called ch1 on agent1
>> agent1.channels.ch1.type = memory
>> agent1.channels.ch1.capacity = 200000
>> agent1.channels.ch1.transactionCapacity = 20000
>> agent1.channels.ch2.type = memory
>> agent1.channels.ch2.capacity = 1000000
>> agent1.channels.ch2.transactionCapacity = 100000
>> agent1.channels.ch3.type = memory
>> agent1.channels.ch3.capacity = 10000
>> agent1.channels.ch3.transactionCapacity = 5000
>>
>> #agent1.channels.ch2.type = memory
>> #agent1.channels.ch3.type = memory
>>
>> # Define an Mycustom custom source called mycustom-source1 on agent1 and tell it
>> # to bind to 0.0.0.0:41414. Connect it to channel ch1.
>> agent1.sources.mycustom-source1.channels = ch1 ch2 ch3
>> agent1.sources.mycustom-source1.type org.apache.flume.source.MycustomSource
>> agent1.sources.mycustom-source1.bind = 127.0.0.1
>> agent1.sources.mycustom-source1.port = 1234
>> agent1.sources.mycustom-source1.serialization_method = json
>> #agent1.sources.mycustom-source1.schema_filepath /home/ubuntu/Software/flume/trunk/conf/AvroEventSchema.avpr
>>
>> # Define an HDFS sink
>> agent1.sinks.hdfs-sink1.channel = ch1
>> agent1.sinks.hdfs-sink1.type = hdfs
>> agent1.sinks.hdfs-sink1.hdfs.path = hdfs://localhost:54310/user/flumeDump1
>> agent1.sinks.hdfs-sink1.hdfs.filePrefix = events
>> agent1.sinks.hdfs-sink1.hdfs.batchSize = 20000
>> agent1.sinks.hdfs-sink1.hdfs.fileType = DataStream
>> agent1.sinks.hdfs-sink1.hdfs.writeFormat = Text
>> agent1.sinks.hdfs-sink1.hdfs.maxOpenFiles = 10000
>> agent1.sinks.hdfs-sink1.hdfs.rollSize = 0
>> agent1.sinks.hdfs-sink1.hdfs.rollInterval = 0
>> agent1.sinks.hdfs-sink1.hdfs.rollCount = 20000
>> agent1.sinks.hdfs-sink1.hdfs.hdfs.threadsPoolSize = 20
>>
>> agent1.sinks.hdfs-sink2.channel = ch2
>> agent1.sinks.hdfs-sink2.type = hdfs
>> agent1.sinks.hdfs-sink2.hdfs.path = hdfs://localhost:54310/user/flumeDump2
>> agent1.sinks.hdfs-sink2.hdfs.filePrefix = events
>> agent1.sinks.hdfs-sink2.hdfs.batchSize = 100000
>> agent1.sinks.hdfs-sink2.hdfs.fileType = DataStream
>> agent1.sinks.hdfs-sink2.hdfs.writeFormat = Text
>> agent1.sinks.hdfs-sink2.hdfs.maxOpenFiles = 10000
>> agent1.sinks.hdfs-sink2.hdfs.rollSize = 0
>> agent1.sinks.hdfs-sink2.hdfs.rollInterval = 0
>> agent1.sinks.hdfs-sink2.hdfs.rollCount = 100000
>> agent1.sinks.hdfs-sink2.hdfs.hdfs.threadsPoolSize = 20
>>
>> agent1.sinks.hdfs-sink3.channel = ch3
>> agent1.sinks.hdfs-sink3.type = hdfs
>> agent1.sinks.hdfs-sink3.hdfs.path = hdfs://localhost:54310/user/flumeDump3
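
The quoted configuration ends before any channel selector settings appear. As a minimal sketch only, assuming a hypothetical event header named `kind` with hypothetical values `type1` and `type2` (neither appears in the original message), a multiplexing selector for this source is typically declared along these lines in a Flume NG properties file:

```properties
# Sketch only: the header name "kind" and the values "type1"/"type2" are
# assumptions for illustration, not taken from the original configuration.
agent1.sources.mycustom-source1.selector.type = multiplexing
agent1.sources.mycustom-source1.selector.header = kind
agent1.sources.mycustom-source1.selector.mapping.type1 = ch1
agent1.sources.mycustom-source1.selector.mapping.type2 = ch2
# Events whose header value matches no mapping go to the default channel.
agent1.sources.mycustom-source1.selector.default = ch3
```

With a layout like this, each channel is drained by exactly one sink (ch1 by hdfs-sink1, ch2 by hdfs-sink2, ch3 by hdfs-sink3), which matches the sink-to-channel assignments in the quoted file.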
Harish Mandala 2012-09-26, 11:49