Flume >> mail # user >> flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created


flume 1.4.0 avro source/sink with hdfs sink configuration - no hdfs files created
Hi,
I have set up the following Flume topology, but nothing gets written to HDFS.
There are no errors either. Do you know what's going wrong?

I have a standalone configuration for just *tail exec source -> file
channel -> hdfs sink* working, but when I use Avro, it's getting messed up:
*exec tail -f source -> file channel -> avro sink -> avro source -> file
channel -> hdfs sink*
flume-avro.conf is as follows:

agent.sources = reader
  agent.channels = fileChannel
  agent.sinks = avro-forward-sink
  # For each one of the sources, the type is defined
  agent.sources.reader.type = exec

  agent.sources.reader.command = tail -f /opt/mapr/logs/configure.log
  # stderr is simply discarded, unless logStdErr=true

  # If the process exits for any reason, the source also exits and
  # will produce no further data (unless restart = true, as set below).
  agent.sources.reader.logStdErr = true

  agent.sources.reader.restart = true

  # The channel can be defined as follows.

  agent.sources.reader.channels = fileChannel

  # Each sink's type must be defined

  agent.sinks.avro-forward-sink.type = avro
  agent.sinks.avro-forward-sink.hostname = localhost
  agent.sinks.avro-forward-sink.port = 41414

  #Specify the channel the sink should use
  agent.sinks.avro-forward-sink.channel = fileChannel

  # Each channel's type is defined.
  agent.channels.fileChannel.type = FILE

  # Other config values specific to each type of channel (sink or source)
  # can be defined as well
  agent.channels.fileChannel.transactionCapacity = 1000000
  agent.channels.fileChannel.checkpointInterval = 30000
  agent.channels.fileChannel.maxFileSize = 2146435071
  agent.channels.fileChannel.capacity = 10000000

############################################################
  agent.sources = avro-collection-source
  agent.channels = channel1

  agent.sinks = hdfs-sink

  # For each one of the sources, the type is defined

  agent.sources.avro-collection-source.type = avro
  agent.sources.avro-collection-source.bind = 0.0.0.0
  agent.sources.avro-collection-source.port = 41414

  # The channel can be defined as follows.
  agent.sources.avro-collection-source.channels = channel1
  agent.sinks.hdfs-sink.channel = channel1

  # Each sink's type must be defined
  agent.sinks.hdfs-sink.type = hdfs
  agent.sinks.hdfs-sink.kerberosPrincipal = flume/[EMAIL PROTECTED]
  agent.sinks.hdfs-sink.kerberosKeytab = /opt/mapr/conf/flume.keytab
  agent.sinks.hdfs-sink.path = /user/root/flume/log_test7/
  agent.sinks.hdfs-sink.filePrefix = LogCreateTest
  agent.sinks.hdfs-sink.rollInterval = 6

  agent.sinks.hdfs-sink.rollSize = 0
  agent.sinks.hdfs-sink.rollCount = 10000
  agent.sinks.hdfs-sink.batchSize = 10000
  agent.sinks.hdfs-sink.txnEventMax = 40000
  agent.sinks.hdfs-sink.fileType = DataStream

  agent.sinks.hdfs-sink.maxOpenFiles = 50
  agent.sinks.hdfs-sink.appendTimeout = 10000
  agent.sinks.hdfs-sink.callTimeout = 10000
  agent.sinks.hdfs-sink.threadsPoolSize = 100
  agent.sinks.hdfs-sink.rollTimerPoolSize = 1

  agent.channels.channel1.type = FILE
  agent.channels.channel1.transactionCapacity = 1000000
  agent.channels.channel1.checkpointInterval = 30000
  agent.channels.channel1.maxFileSize = 2146435071
  agent.channels.channel1.capacity = 10000000
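
For comparison, here is a minimal sketch of the same two-hop topology written as two separately named agents. It rests on assumptions rather than on anything confirmed in this thread: that both halves above live in one file under the single agent name `agent` (in which case the later `agent.sources`, `agent.channels` and `agent.sinks` lines would replace the earlier ones, since the file is read as Java properties), and that both file channels would otherwise fall back to the same default directories under ~/.flume/file-channel. The agent names `forwarder` and `collector` and the /var/flume/... directories are hypothetical. The HDFS sink keys are written with the `hdfs.` prefix, which is how the Flume user guide lists them. The Kerberos and tuning settings are left out for brevity.

```properties
# Hop 1 (hypothetical agent name "forwarder"): tail the log, forward over Avro
forwarder.sources = reader
forwarder.channels = fileChannel
forwarder.sinks = avro-forward-sink

forwarder.sources.reader.type = exec
forwarder.sources.reader.command = tail -f /opt/mapr/logs/configure.log
forwarder.sources.reader.restart = true
forwarder.sources.reader.channels = fileChannel

forwarder.channels.fileChannel.type = file
# Hypothetical paths: each file channel gets its own directories
forwarder.channels.fileChannel.checkpointDir = /var/flume/forwarder/checkpoint
forwarder.channels.fileChannel.dataDirs = /var/flume/forwarder/data

forwarder.sinks.avro-forward-sink.type = avro
forwarder.sinks.avro-forward-sink.hostname = localhost
forwarder.sinks.avro-forward-sink.port = 41414
forwarder.sinks.avro-forward-sink.channel = fileChannel

# Hop 2 (hypothetical agent name "collector"): receive Avro, write to HDFS
collector.sources = avro-collection-source
collector.channels = channel1
collector.sinks = hdfs-sink

collector.sources.avro-collection-source.type = avro
collector.sources.avro-collection-source.bind = 0.0.0.0
collector.sources.avro-collection-source.port = 41414
collector.sources.avro-collection-source.channels = channel1

collector.channels.channel1.type = file
# Hypothetical paths, distinct from the forwarder's
collector.channels.channel1.checkpointDir = /var/flume/collector/checkpoint
collector.channels.channel1.dataDirs = /var/flume/collector/data

collector.sinks.hdfs-sink.type = hdfs
collector.sinks.hdfs-sink.channel = channel1
# HDFS sink settings carry the "hdfs." prefix
collector.sinks.hdfs-sink.hdfs.path = /user/root/flume/log_test7/
collector.sinks.hdfs-sink.hdfs.filePrefix = LogCreateTest
collector.sinks.hdfs-sink.hdfs.fileType = DataStream
collector.sinks.hdfs-sink.hdfs.rollInterval = 6
```

Under this layout each agent would run as its own process, selected by name, for example `flume-ng agent -n collector -c conf -f flume-avro.conf` and then the same command with `-n forwarder`. Starting the collector first gives the Avro sink something to connect to.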

Thanks,
Suhas.