Flume >> mail # user >> HDFSsink failover error


Re: HDFSsink failover error
Here is the full config. I swapped the priorities on the sink processor after performing the namenode failover, and the writes then succeeded to the newly active namenode.

agent1.channels.ch1.type = FILE
agent1.channels.ch1.checkpointDir = /flume_runtime/checkpoint
agent1.channels.ch1.dataDirs = /flume_runtime/data
agent1.channels.ch2.type = FILE
agent1.channels.ch2.checkpointDir = /flume_runtime/checkpoint2
agent1.channels.ch2.dataDirs = /flume_runtime/data2

# Define an Avro source called avro-source1 on agent1 and tell it
# to bind to 0.0.0.0:4545. Connect it to channel ch1.

agent1.sources.avro-source1.channels = ch1
agent1.sources.avro-source1.type = avro
agent1.sources.avro-source1.bind = 0.0.0.0
agent1.sources.avro-source1.port = 4545

agent1.sources.avro-source2.channels = ch2
agent1.sources.avro-source2.type = avro
agent1.sources.avro-source2.bind = 0.0.0.0
agent1.sources.avro-source2.port = 4546

agent1.sinks.hdfs-sink1.channel = ch1
agent1.sinks.hdfs-sink1.type = hdfs
agent1.sinks.hdfs-sink1.hdfs.path = hdfs://ip-10-4-71-187.ec2.internal/user/br/shim/eventstream/event/host101/
agent1.sinks.hdfs-sink1.hdfs.filePrefix = event
agent1.sinks.hdfs-sink1.hdfs.writeFormat = Text
agent1.sinks.hdfs-sink1.hdfs.rollInterval = 120
agent1.sinks.hdfs-sink1.hdfs.rollCount = 0
agent1.sinks.hdfs-sink1.hdfs.rollSize = 0
agent1.sinks.hdfs-sink1.hdfs.fileType = DataStream
agent1.sinks.hdfs-sink1.hdfs.batchSize = 1000
agent1.sinks.hdfs-sink1.hdfs.txnEventSize = 1000

agent1.sinks.hdfs-sink2.channel = ch2
agent1.sinks.hdfs-sink2.type = hdfs
agent1.sinks.hdfs-sink2.hdfs.path = hdfs://ip-10-4-71-187.ec2.internal/user/br/shim/eventstream/event/host102/
agent1.sinks.hdfs-sink2.hdfs.filePrefix = event
agent1.sinks.hdfs-sink2.hdfs.writeFormat = Text
agent1.sinks.hdfs-sink2.hdfs.rollInterval = 120
agent1.sinks.hdfs-sink2.hdfs.rollCount = 0
agent1.sinks.hdfs-sink2.hdfs.rollSize = 0
agent1.sinks.hdfs-sink2.hdfs.fileType = DataStream
agent1.sinks.hdfs-sink2.hdfs.batchSize = 1000
agent1.sinks.hdfs-sink2.hdfs.txnEventSize = 1000

agent1.sinks.hdfs-sink1-back.channel = ch1
agent1.sinks.hdfs-sink1-back.type = hdfs
agent1.sinks.hdfs-sink1-back.hdfs.path = hdfs://ip-10-110-69-240.ec2.internal/user/br/shim/eventstream/event/host101/
agent1.sinks.hdfs-sink1-back.hdfs.filePrefix = event
agent1.sinks.hdfs-sink1-back.hdfs.writeFormat = Text
agent1.sinks.hdfs-sink1-back.hdfs.rollInterval = 120
agent1.sinks.hdfs-sink1-back.hdfs.rollCount = 0
agent1.sinks.hdfs-sink1-back.hdfs.rollSize = 0
agent1.sinks.hdfs-sink1-back.hdfs.fileType = DataStream
agent1.sinks.hdfs-sink1-back.hdfs.batchSize = 1000
agent1.sinks.hdfs-sink1-back.hdfs.txnEventSize = 1000

agent1.sinks.hdfs-sink2-back.channel = ch2
agent1.sinks.hdfs-sink2-back.type = hdfs
agent1.sinks.hdfs-sink2-back.hdfs.path = hdfs://ip-10-110-69-240.ec2.internal/user/br/shim/eventstream/event/host102/
agent1.sinks.hdfs-sink2-back.hdfs.filePrefix = event
agent1.sinks.hdfs-sink2-back.hdfs.writeFormat = Text
agent1.sinks.hdfs-sink2-back.hdfs.rollInterval = 120
agent1.sinks.hdfs-sink2-back.hdfs.rollCount = 0
agent1.sinks.hdfs-sink2-back.hdfs.rollSize = 0
agent1.sinks.hdfs-sink2-back.hdfs.fileType = DataStream
agent1.sinks.hdfs-sink2-back.hdfs.batchSize = 1000
agent1.sinks.hdfs-sink2-back.hdfs.txnEventSize = 1000

agent1.sinkgroups.failoverGroup1.sinks = hdfs-sink1 hdfs-sink1-back
agent1.sinkgroups.failoverGroup1.processor.type = failover
#higher number in priority is higher priority
agent1.sinkgroups.failoverGroup1.processor.priority.hdfs-sink1 = 10
agent1.sinkgroups.failoverGroup1.processor.priority.hdfs-sink1-back = 5
#failover if failure detected for 10 seconds
agent1.sinkgroups.failoverGroup1.processor.maxpenalty = 10000

agent1.sinkgroups.failoverGroup2.sinks = hdfs-sink2 hdfs-sink2-back
agent1.sinkgroups.failoverGroup2.processor.type = failover
#higher number in priority is higher priority
agent1.sinkgroups.failoverGroup2.processor.priority.hdfs-sink2 = 10
agent1.sinkgroups.failoverGroup2.processor.priority.hdfs-sink2-back = 5
#failover if failure detected for 10 seconds
agent1.sinkgroups.failoverGroup2.processor.maxpenalty = 10000

# Finally, now that we've defined all of our components, tell
# agent1 which ones we want to activate.
agent1.sinkgroups = failoverGroup1 failoverGroup2
agent1.channels = ch1 ch2
agent1.sources = avro-source1 avro-source2
agent1.sinks = hdfs-sink1 hdfs-sink2 hdfs-sink1-back hdfs-sink2-back
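
Since each sink points at a concrete namenode host, a namenode failover currently requires manually swapping the sink-processor priorities. If the cluster runs HDFS HA, the sinks could instead target a logical nameservice and follow an active-namenode change automatically. A minimal sketch, assuming a hypothetical nameservice name "mycluster" and the standard ConfiguredFailoverProxyProvider (names and ports here are illustrative, not from the thread):

```properties
# hdfs-site.xml on the Flume host (hypothetical nameservice "mycluster"):
#
#   <property><name>dfs.nameservices</name><value>mycluster</value></property>
#   <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
#   <property><name>dfs.namenode.rpc-address.mycluster.nn1</name>
#     <value>ip-10-4-71-187.ec2.internal:8020</value></property>
#   <property><name>dfs.namenode.rpc-address.mycluster.nn2</name>
#     <value>ip-10-110-69-240.ec2.internal:8020</value></property>
#   <property><name>dfs.client.failover.proxy.provider.mycluster</name>
#     <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>

# The Flume sink path then uses the logical URI instead of a concrete host,
# so the HDFS client retries against whichever namenode is active:
agent1.sinks.hdfs-sink1.hdfs.path = hdfs://mycluster/user/br/shim/eventstream/event/host101/
```

With that in place the failover sink groups would only be needed for sink-level failures, not for namenode failover.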
________________________________
 From: Connor Woodson <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]; Rahul Ravindran <[EMAIL PROTECTED]>
Sent: Monday, January 14, 2013 2:28 PM
Subject: Re: HDFSsink failover error
 

I assume that's only part of your config as it's missing a source; if you get rid of the sink processor, can you write to each hdfs sink individually? (comment one out at a time)

- Connor

On Mon, Jan 14, 2013 at 1:42 PM, Rahul Ravindran <[EMAIL PROTECTED]> wrote:

Hi,