Flume >> mail # user >> Writing to HDFS from multiple HDFS agents (separate machines)


Re: Writing to HDFS from multiple HDFS agents (separate machines)
I was able to tell the different sources apart with this config, which
creates a separate directory per hostname:

agent.sources.syslogsrc.interceptors = ts
agent.sources.syslogsrc.interceptors.ts.type = timestamp
agent.sinks.hdfsSink.hdfs.path = hdfs://<ip_addr>:<port>/flumetest/%{host}/%y-%m-%d

However, I have a related question.  Two different products are sending
their logs to one source, and I am collecting them via syslog.  Is there a
way in Flume to differentiate the logs of two products coming from a
single source?  Ideally I would like to have a sub-directory at the sink
like '/flumetest/%{host}/<product_name>/%y-%m-%d'.  How can I do this?
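
One possible approach, offered as a sketch rather than something confirmed
in this thread: use Flume's regex_extractor interceptor to copy the product
name from the event body into a header, then reference that header in the
sink path the same way %{host} is used above.  The interceptor name 'prod',
the header name 'product', and the names productA/productB in the regex are
placeholders; the pattern has to be adapted to how each product actually
tags its syslog lines, and this assumes the Flume version in use ships the
regex_extractor interceptor.

```properties
# Sketch only: 'prod', 'product', and productA/productB are hypothetical names.
agent.sources.syslogsrc.interceptors = ts prod
agent.sources.syslogsrc.interceptors.ts.type = timestamp

# Extract the first capture group from the event body into a header.
agent.sources.syslogsrc.interceptors.prod.type = regex_extractor
agent.sources.syslogsrc.interceptors.prod.regex = (productA|productB)
agent.sources.syslogsrc.interceptors.prod.serializers = s1
agent.sources.syslogsrc.interceptors.prod.serializers.s1.name = product

# Use the extracted header as an extra directory level.
agent.sinks.hdfsSink.hdfs.path = hdfs://<ip_addr>:<port>/flumetest/%{host}/%{product}/%y-%m-%d
```

Events whose body does not match the regex get no 'product' header, so it
is worth checking where those events end up before relying on this layout.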

Thanks,
- Seshu
On Thu, Mar 14, 2013 at 5:00 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:

> Hello sir,
>
>     One idea could be to create the sub-directories with the machines'
> hostnames, in case you are getting data from multiple sources. You can
> then easily find out which data belongs to which machine.
>
> Warm Regards,
> Tariq
> https://mtariq.jux.com/
> cloudfront.blogspot.com
>
>
> On Fri, Mar 15, 2013 at 3:24 AM, Gary Malouf <[EMAIL PROTECTED]> wrote:
>
>> Hi guys,
>>
>> I'm new to Flume (and to HDFS, for that matter), using the version
>> packaged with CDH4 (1.3.0), and was wondering how others keep the file
>> names written by each HDFS sink distinct.
>>
>> My initial thought is to create a separate sub-directory in hdfs for each
>> sink - though I feel like the better way is to somehow prefix each file
>> with a unique sink id.  Are there any patterns that others are following
>> for this?
>>
>> -Gary
>>
>
>