Thanks for the reply. My use case is not really special. We have
multiple products, and each product emits traditional log messages on
different servers. I would like to stream those into HDFS. The logs are
generally in Apache or log4j format.
So I have many sources from which I want to stream logs into HDFS. I
can have a channel/collector machine where I install Flume. I guess my
question is: do I need to install Flume on the servers where the log
files live, and do I need to install Flume on the HDFS namenode too?
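For what it's worth, here is a minimal sketch of the topology I have in mind, based on the Flume NG user guide. Hostnames, ports, and paths (collector.example.com, 4141, the log and HDFS paths) are placeholders, and the exec/tail source is just one way to pick up existing log files. On each log server, an agent would tail the log and forward events over Avro:

```
# log-server agent (one per source machine) - placeholder paths/hosts
agent1.sources = tail
agent1.channels = mem
agent1.sinks = avroSink

agent1.sources.tail.type = exec
agent1.sources.tail.command = tail -F /var/log/httpd/access_log
agent1.sources.tail.channels = mem

agent1.channels.mem.type = memory

agent1.sinks.avroSink.type = avro
agent1.sinks.avroSink.hostname = collector.example.com
agent1.sinks.avroSink.port = 4141
agent1.sinks.avroSink.channel = mem
```

And the collector machine would run a second agent with an Avro source and an HDFS sink (the HDFS sink talks to the namenode over the HDFS client, so nothing extra should need to run on the namenode itself, if I understand correctly):

```
# collector agent - placeholder bind address and HDFS path
collector.sources = avroSrc
collector.channels = mem
collector.sinks = hdfsSink

collector.sources.avroSrc.type = avro
collector.sources.avroSrc.bind = 0.0.0.0
collector.sources.avroSrc.port = 4141
collector.sources.avroSrc.channels = mem

collector.channels.mem.type = memory

collector.sinks.hdfsSink.type = hdfs
collector.sinks.hdfsSink.hdfs.path = hdfs://namenode:8020/flume/logs
collector.sinks.hdfsSink.channel = mem
```

Does that look like the intended setup?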
On Wed, Feb 6, 2013 at 7:47 PM, Jeff Lord <[EMAIL PROTECTED]> wrote:
> It really is going to depend on your use case.
> Though it sounds like you may need to run an agent on each of the source
> machines. Which source do you plan to use? It may also be the case that
> you can use the Flume RPC client to write data directly from your
> application to the Flume collector machine.
> On Wed, Feb 6, 2013 at 4:49 PM, Seshu V <[EMAIL PROTECTED]> wrote:
>> Hi All,
>> I used Flume 0.9.3 a while back, and it worked fine at that time.
>> Now I am looking to use Flume NG, and I started reading the documentation today.
>> In Flume 0.9.3, I installed Flume agents on the servers wherever the
>> data sources were, and I had a separate collector machine. My sink was
>> HDFS. I see that Flume NG uses channels.
>> My question is that I have multiple source servers and my sink is
>> HDFS. I also have another machine for the channel (the collector in the old days).
>> Do I need to install Flume NG on all the source machines and the channel
>> machine? Or can I install Flume NG only on the channel server and
>> (somehow) configure it to pull data from the source machines and
>> specify the sink as HDFS?
>> Thanks in advance for your replies..
>> - Seshu