Flume >> mail # dev >> Flume custom decorator for Rolling FileSink output bucketing


Dibyajyoti Ghosh 2013-03-15, 19:16
Mike Percy 2013-03-15, 19:38
Dibyajyoti Ghosh 2013-03-15, 21:20
Juhani Connolly 2013-03-18, 01:55
Re: Flume custom decorator for Rolling FileSink output bucketing
Hi Juhani,

Thank you very much for clarifying the doubts I have had about the documentation
for quite some time now. I downloaded the Flume source from git and am now
looking into the HDFS sink code base. As you said, it will not be a small
patch. I will keep the community posted about the changes.

Are you aware of any plan to implement output bucketing (i.e. dynamic
paths) in the FileRoll sink in an upcoming Flume release?

thanks a lot,
- dib
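
For readers unfamiliar with the term, "output bucketing" here means expanding a path template per event, the way the HDFS sink does with escapes such as %{host} or %Y. The following is only a minimal sketch of that idea, not Flume's own code: the class name BucketPathSketch, the method expand, and the example path are hypothetical, only a handful of escapes are handled, a missing header is assumed to expand to an empty string, and UTC is assumed for the date fields.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;
import java.util.TimeZone;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Flume): expands a bucketed path template. */
class BucketPathSketch {
    // Matches %{headerName} or one of the date escapes %Y, %m, %d, %H.
    private static final Pattern ESCAPE =
        Pattern.compile("%\\{([^}]+)\\}|%([YmdH])");

    /** Replaces each escape in the template using the event headers and timestamp. */
    static String expand(String template, Map<String, String> headers,
                         long timestampMillis) {
        Matcher m = ESCAPE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement;
            if (m.group(1) != null) {
                // Header escape: assumption, a missing header becomes "".
                String value = headers.get(m.group(1));
                replacement = (value == null) ? "" : value;
            } else {
                replacement = formatDate(m.group(2).charAt(0), timestampMillis);
            }
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    private static String formatDate(char escape, long timestampMillis) {
        String pattern;
        switch (escape) {
            case 'Y': pattern = "yyyy"; break;
            case 'm': pattern = "MM"; break;
            case 'd': pattern = "dd"; break;
            default:  pattern = "HH"; break; // 'H'
        }
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        // Assumption: a fixed UTC zone, so the result does not depend on the host.
        fmt.setTimeZone(TimeZone.getTimeZone("UTC"));
        return fmt.format(new Date(timestampMillis));
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<String, String>();
        headers.put("host", "web01");
        // Timestamp 0 is the Unix epoch; prints /var/log/flume/web01/1970-01-01
        System.out.println(expand("/var/log/flume/%{host}/%Y-%m-%d", headers, 0L));
    }
}
```

A sink that buckets this way has to keep one open file per expanded path and roll each of them independently, which is the part Juhani describes below as more than a quick fix for the file sink.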
On Sun, Mar 17, 2013 at 6:55 PM, Juhani Connolly <
[EMAIL PROTECTED]> wrote:

> Dib, that article refers to Flume OG (0.95); it's not relevant to
> the current release.
>
> I had looked in the past at fixing the file sink to use the same
> bucketing available to the HDFS sink, but unfortunately it seemed like it
> would take more than a quick fix. The PathManager currently only works with
> one File at a time, and the rolling logic is connected to that. You'd
> basically have to replace most of the logic, ideally reusing the bucketing
> logic from the HDFS sink. As Mike said, you should probably just use the
> HDFS sink with file:// unless you feel like improving the current sink.
>
>
> On 03/16/2013 06:20 AM, Dibyajyoti Ghosh wrote:
>
>> Thanks, Mike, for the suggestion. The reason I am considering the regular
>> file system for log storage is to avoid latency issues with file retrieval,
>> and to let users scrape log files using grep, awk, and the multitude of
>> other powerful commands available on conventional storage.
>>
>> I am now thinking of coming up with my own decorator classes for the
>> RollingFile sink. Any pointers on how I can get started on writing my
>> custom decorators?
>>
>> Another quick question: can you, Mike, or somebody from the Flume community
>> tell me how to use the commands documented here:
>> http://archive.cloudera.com/cdh/3/flume/UserGuide/#_introducing_sink_decorators
>>
>> Is this available in the flume-ng distributed with the Cloudera solution,
>> i.e. Flume 1.3.0?
>>
>> Best and thanks a lot again,
>> - dib
>>
>>
>> On Fri, Mar 15, 2013 at 12:38 PM, Mike Percy <[EMAIL PROTECTED]> wrote:
>>
>>  Dib, you could use the HDFS sink with a file:// URL as an option.
>>>
>>> Regards,
>>> Mike
>>>
>>>
>>>
>>> On Fri, Mar 15, 2013 at 12:16 PM, Dibyajyoti Ghosh <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> Dear flume team,
>>>>
>>>> I am using Flume 1.3.0, bundled with the Cloudera 4.2.0 distribution, to
>>>> log to the local file system. But the current implementation of FileSink
>>>> doesn't have inline decorators like the HDFS sink, where output can be
>>>> stored in directories based on event metadata, e.g. the hostname of the
>>>> event, the timestamp, or some other attribute in the message object.
>>>>
>>>> How can I do the same for FileSink?
>>>>
>>>>
>>>> Thanks a lot,
>>>> - dib
>>>>
>>>>
>
Juhani Connolly 2013-03-19, 03:15