Flume >> mail # user >> why flume agent do not support --fileheader

Re: why flume agent do not support --fileheader
Hi Yogi,

It seems there is no common method for this.

For the spooldir source, you can set agent1.sources.gamelog.fileHeader=true
and agent1.sources.gamelog.fileHeaderKey=fullfilename to get the full path of
the file. But I don't know how to get the file name from the full path just using

If the file names do not change and there are not too many files, maybe you
can use an interceptor to configure some file name info.
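
As a rough sketch of the interceptor idea (not a tested setup): the fragment
below tags every event from one spooldir source with a fixed "filename"
header, and a downstream agent uses that header as the HDFS file prefix. The
names agent2, hdfs1 and ch1, the spoolDir and the hdfs.path value are made up
for illustration, and I am assuming hdfs.filePrefix expands %{header} escapes
the same way hdfs.path does; please check that against the user guide for
your Flume version. Channel definitions and the Avro hop between the two
agents are left out.

```properties
# Sending agent: spooling directory source.
agent1.sources = gamelog
agent1.sources.gamelog.type = spooldir
# Hypothetical directory, replace with your own.
agent1.sources.gamelog.spoolDir = /data/gamelog
agent1.sources.gamelog.channels = ch1
# Put the full path of the originating file into the "fullfilename" header.
agent1.sources.gamelog.fileHeader = true
agent1.sources.gamelog.fileHeaderKey = fullfilename
# Static interceptor: attach a fixed short name to every event.
agent1.sources.gamelog.interceptors = i1
agent1.sources.gamelog.interceptors.i1.type = static
agent1.sources.gamelog.interceptors.i1.key = filename
agent1.sources.gamelog.interceptors.i1.value = gamelog

# Receiving agent (hypothetical names): HDFS sink that reuses the header.
agent2.sinks = hdfs1
agent2.sinks.hdfs1.type = hdfs
agent2.sinks.hdfs1.channel = ch1
# Hypothetical target directory.
agent2.sinks.hdfs1.hdfs.path = /flume/gamelog
# Assumption: header escapes are expanded in filePrefix.
agent2.sinks.hdfs1.hdfs.filePrefix = %{filename}
```

A static interceptor attaches one fixed value per source, so this only helps
when each source maps to one logical file name. That is why it fits only a
small, stable set of files.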

Best Regards,

2013/2/23 Yogi Nerella <[EMAIL PROTECTED]>

> Hi,
> I want to ship various types of files from one agent to another agent, and
> a destination agent should rewrite them to the same filename as the source
> filename.
> Like a simple FTP source to FTP destination, but I need it via Flume
> because I want to do some intercepting and processing. Is there any
> simple configuration for it? Or can Flume not support such functionality?
> Thanks,
> Yogi
> On Fri, Feb 22, 2013 at 12:36 AM, 周梦想 <[EMAIL PROTECTED]> wrote:
>> OK, I see.
>> 2013/2/22 Juhani Connolly <[EMAIL PROTECTED]>
>>> Sounds like a job for an editor macro or a simple script.
>>> Perhaps someone else can think of something else, but adding it to the
>>> command line isn't feasible... If we started throwing random convenient
>>> stuff in there, it would quickly become a mess.
>>> On 02/22/2013 01:30 PM, 周梦想 wrote:
>>> Yes, I mean all event headers sent from a specific agent. Some headers
>>> are needed for all events, not only the hostname. Right now I have to
>>> configure them on every source, time and time again.
>>> For example:
>>> if I configure 10 sources and every source needs 5 headers, I have to
>>> configure them 50 times.
>>> Other settings shared by sources of the same type have the same problem.
>>> E.g.:
>>> agent1.sources = gamelog src1 src2 ... src10
>>> agent1.sources.gamelog.fileSuffix = .fin
>>> agent1.sources.gamelog.fileHeader = true
>>> agent1.sources.gamelog.fileHeaderKey = fullfilename
>>> agent1.sources.gamelog.batchSize = 100
>>> agent1.sources.gamelog.bufferMaxLines = 1000
>>> agent1.sources.gamelog.bufferMaxLineLength = 5000
>>> agent1.sources.gamelog.interceptors = i1 i2 i3
>>> # for %{host}: org.apache.flume.interceptor.HostInterceptor$Builder
>>> agent1.sources.gamelog.interceptors.i1.type = host
>>> agent1.sources.gamelog.interceptors.i2.type = timestamp
>>> ...
>>> agent1.sources.gamelog.interceptors.i3.type = static
>>> agent1.sources.gamelog.interceptors.i3.key = filename
>>> agent1.sources.gamelog.interceptors.i3.value = gamelog
>>> *Repeat this configuration for src1, ... src10, 10 or even more times.*
>>> It's very tedious.
>>> Maybe we could group the sources and sinks and configure them just once
>>> or a few times.
>>> Best Regards,
>>> Andy
>>> 2013/2/22 Juhani Connolly <[EMAIL PROTECTED]>
>>>> If I understand your question, using a static interceptor is the
>>>> expected way to do this.
>>>> Not sure what you mean by agent header? Do you mean all event headers
>>>> sent from a specific agent? I don't imagine we will be adding a
>>>> command-line parameter to do this, it wouldn't be consistent and would be
>>>> superfluous.
>>>> If you want to reuse a configuration file, and add a header to inform
>>>> what agent it came from, perhaps you could use the hostname interceptor?
>>>> On 02/22/2013 12:06 PM, 周梦想 wrote:
>>>>> I want to add some key/value pairs to the agent header, but it's not
>>>>> convenient to do so. Why doesn't the flume agent support --headerFile,
>>>>> just like the flume avro-client?
>>>>> My requirement is:
>>>>> I want to use a spooling source to send files to another Flume node,
>>>>> which writes them to HDFS,
>>>>> and I want the HDFS file names to use the original file name as a
>>>>> prefix, not the full path of the original file.
>>>>> For now I have to add interceptors to the conf to do that.
>>>>> So my question is: why doesn't the flume agent support --headerFile,
>>>>> just like the flume avro-client?
>>>>> Best Regards,
>>>>> Andy