Flume >> mail # user >> Roll based on date


Re: Roll based on date
Hi David,

The following is my configuration file:

agent.sources = seqGenSrc
agent.channels = fileChannel
agent.sinks = s3Sink

# For each one of the sources, the type is defined
agent.sources.seqGenSrc.type = syslogtcp
agent.sources.seqGenSrc.port = 5140
agent.sources.seqGenSrc.host = localhost
agent.sources.seqGenSrc.keepFields = true

# The channel can be defined as follows.
agent.sources.seqGenSrc.channels = fileChannel

# Each sink's type must be defined
agent.sinks.s3Sink.type = hdfs

#Specify the channel the sink should use
agent.sinks.s3Sink.channel = fileChannel
agent.sinks.s3Sink.hdfs.path = s3n://awskeyid:awssecretkey@bucket_name/%{host}
agent.sinks.s3Sink.hdfs.filePrefix = FlumeData.%Y-%m-%d
agent.sinks.s3Sink.hdfs.rollInterval = 0
agent.sinks.s3Sink.hdfs.rollSize = 0
agent.sinks.s3Sink.hdfs.rollCount = 0
agent.sinks.s3Sink.hdfs.batchSize = 0
agent.sinks.s3Sink.hdfs.idleTimeout = 600
agent.sinks.s3Sink.hdfs.fileType = DataStream

# Each channel's type is defined.
agent.channels.fileChannel.type = file

# Other config values specific to each type of channel(sink or source)
# can be defined as well
# In this case, it specifies the capacity of the file channel
agent.channels.fileChannel.capacity = 1000000

Thanks.

Martinus
On Fri, Oct 25, 2013 at 10:20 PM, David Sinclair <
[EMAIL PROTECTED]> wrote:

> Does the metrics endpoint show that events are still coming into this sink?
>
> http://hostname of agent:41414/metrics <http://falcon:41414/metrics>
>
> Also, can you post the rest of the config?
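
One way to answer the metrics question above is to read the agent's JSON metrics twice and compare the sink's drain counter. The following is a minimal sketch, not a definitive tool. It assumes the agent was started with HTTP JSON monitoring on port 41414, that the sink appears under the key "SINK.s3Sink", and that the counter of interest is "EventDrainSuccessCount" reported as a string. The helper names are hypothetical.

```python
import json
import urllib.request


def fetch_metrics(url: str) -> str:
    """Download the metrics document from the agent (URL is an assumption)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def sink_drain_count(metrics_json: str, sink_name: str) -> int:
    """Return the number of events the named sink reports as written out."""
    metrics = json.loads(metrics_json)
    # Assumed layout: one top-level entry per component, keyed "SINK.<name>".
    counters = metrics["SINK." + sink_name]
    # Assumed: counter values are serialized as strings.
    return int(counters["EventDrainSuccessCount"])


def sink_is_draining(before_json: str, after_json: str, sink_name: str) -> bool:
    """True if the drain counter grew between the two readings."""
    return sink_drain_count(after_json, sink_name) > sink_drain_count(
        before_json, sink_name
    )
```

In practice you would call fetch_metrics("http://localhost:41414/metrics") twice, a minute or so apart, and pass both results to sink_is_draining with the sink name "s3Sink". A counter that does not move while events keep arriving at the source suggests the sink has stopped writing.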
>
>
> On Thu, Oct 24, 2013 at 10:09 PM, Martinus m <[EMAIL PROTECTED]> wrote:
>
>> Hi David,
>>
>> Almost every few seconds.
>>
>> Thanks.
>>
>> Martinus
>>
>>
>> On Thu, Oct 24, 2013 at 9:49 PM, David Sinclair <
>> [EMAIL PROTECTED]> wrote:
>>
>>> How often are your events coming in?
>>>
>>>
>>> On Thu, Oct 24, 2013 at 2:21 AM, Martinus m <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hi David,
>>>>
>>>> Thanks for the example. I have set it up just like the above, but it only
>>>> generated files for the first 15 minutes. After waiting for more than an
>>>> hour, there was no update at all in the S3 bucket.
>>>>
>>>> Thanks.
>>>>
>>>> Martinus
>>>>
>>>>
>>>> On Wed, Oct 23, 2013 at 8:48 PM, David Sinclair <
>>>> [EMAIL PROTECTED]> wrote:
>>>>
>>>>> You can set all of the time/size based rolling policies to zero and
>>>>> set an idle timeout on the sink. The example below uses a 15-minute
>>>>> timeout:
>>>>>
>>>>> agent.sinks.sink.hdfs.fileSuffix = FlumeData.%Y-%m-%d
>>>>> agent.sinks.sink.hdfs.fileType = DataStream
>>>>> agent.sinks.sink.hdfs.rollInterval = 0
>>>>> agent.sinks.sink.hdfs.rollSize = 0
>>>>> agent.sinks.sink.hdfs.batchSize = 0
>>>>> agent.sinks.sink.hdfs.rollCount = 0
>>>>> agent.sinks.sink.hdfs.idleTimeout = 900
>>>>>
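
To illustrate why a date escape combined with an idle timeout can give one file per day: the sink resolves the escape sequences against each event's timestamp, so the resolved name changes when the date changes, and the previous file is closed once it has been idle for the configured timeout. The sketch below only mimics that name resolution. It assumes the event carries a millisecond timestamp and resolves it in UTC for determinism, whereas the agent's time zone setting may differ. The function names are hypothetical.

```python
from datetime import datetime, timezone


def resolve_prefix(pattern: str, timestamp_ms: int) -> str:
    """Expand %Y, %m and %d in the pattern using the event timestamp."""
    # Assumption: timestamp is milliseconds since the epoch, resolved in UTC.
    moment = datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)
    return moment.strftime(pattern)


def needs_new_file(pattern: str, previous_ms: int, current_ms: int) -> bool:
    """True if the two events resolve to different file names."""
    return resolve_prefix(pattern, previous_ms) != resolve_prefix(
        pattern, current_ms
    )
```

With the pattern "FlumeData.%Y-%m-%d", two events a few seconds apart on either side of midnight resolve to different names, while events within the same day share one name regardless of how far apart they are.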
>>>>>
>>>>>
>>>>> On Tue, Oct 22, 2013 at 10:17 PM, Martinus m <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> Hi David,
>>>>>>
>>>>>> The requirement is actually just to roll once per day.
>>>>>>
>>>>>> Hi Devin,
>>>>>>
>>>>>> Thanks for sharing your experience. I also tried setting the config
>>>>>> as follows:
>>>>>>
>>>>>> agent.sinks.sink.hdfs.fileSuffix = FlumeData.%Y-%m-%d
>>>>>> agent.sinks.sink.hdfs.fileType = DataStream
>>>>>> agent.sinks.sink.hdfs.rollInterval = 0
>>>>>> agent.sinks.sink.hdfs.rollSize = 0
>>>>>> agent.sinks.sink.hdfs.batchSize = 15000
>>>>>> agent.sinks.sink.hdfs.rollCount = 0
>>>>>>
>>>>>> But I didn't see anything in the S3 bucket, so I guess I need to
>>>>>> change rollInterval to 86400. As I understand it, a rollInterval of
>>>>>> 86400 will roll the file after 24 hours, as you said, but it will not
>>>>>> start a new file when the day changes if 24 hours have not yet elapsed
>>>>>> (unless we put a date pattern in fileSuffix, as above).
>>>>>>
>>>>>> Thanks to both of you.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Martinus
>>>>>>
>>>>>>
>>>>>> On Tue, Oct 22, 2013 at 11:16 PM, DSuiter RDX <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>> Martinus, you have to set all the other roll options to 0 explicitly