Re: Roll based on date
The SyslogTcpSource puts a header named 'timestamp' on each Flume event.
This timestamp is taken from the syslog entry. You can then reference it in
the sink's filePrefix.
For example:

tier1.sinks.hdfsSink.hdfs.filePrefix = FlumeData.%{timestamp}
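
One caveat: %{timestamp} expands to the raw header value, which for the syslog sources is, as far as I know, epoch milliseconds, so the prefix would look like FlumeData.1381916293017 rather than a calendar date. If a date such as 2013-10-16 is wanted, a sketch along the following lines may be closer. It uses the HDFS sink's date escapes (%Y, %m, %d), which are resolved from the same 'timestamp' header. The agent and sink names follow the example above; the bucket path and the idleTimeout value are assumptions for illustration only.

```
# Sketch only: the path and idleTimeout values are hypothetical, not from this thread.
tier1.sinks.hdfsSink.type = hdfs
tier1.sinks.hdfsSink.hdfs.path = s3n://my-bucket/flume/logs
# %Y-%m-%d is filled in from the event's 'timestamp' header
tier1.sinks.hdfsSink.hdfs.filePrefix = FlumeData.%Y-%m-%d
# Close a file that has received no events for 10 minutes (value in seconds)
tier1.sinks.hdfsSink.hdfs.idleTimeout = 600
```

With a prefix like this, events stamped Oct 16 and Oct 17 resolve to different file names, so the sink should start a new file when the date in the log changes. The sink appends its own numeric counter after the prefix, which matches the FlumeData.2013-10-16.1381916293017 pattern asked for.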

dave
On Thu, Oct 17, 2013 at 10:23 PM, Martinus m <[EMAIL PROTECTED]> wrote:

> Hi David,
>
> It's syslogtcp.
>
> Thanks.
>
> Martinus
>
>
> On Thu, Oct 17, 2013 at 9:17 PM, David Sinclair <
> [EMAIL PROTECTED]> wrote:
>
>> What type of source are you using?
>>
>>
>> On Wed, Oct 16, 2013 at 9:56 PM, Martinus m <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> Is there any option in the HDFS sink to start rolling a new file
>>> whenever the date in the log changes? For example, I have the logs below:
>>>
>>> Oct 16 23:58:56 test-host : just test
>>> Oct 16 23:59:51 test-host : test again
>>> Oct 17 00:00:56 test-host : just test
>>> Oct 17 00:00:56 test-host : test again
>>>
>>> Then I want it to create files in an S3 bucket, like this:
>>>
>>> FlumeData.2013-10-16.1381916293017 <-- all the logs dated Oct 16, 2013
>>> will go here, and when the log date reaches Oct 17, 2013, it will start
>>> to sink into a new file:
>>>
>>> FlumeData.2013-10-17.1381940047117
>>>
>>> Thanks.
>>>
>>
>>
>