Flume >> mail # user >> flume-cassandra

Re: flume-cassandra
See below.

On Jan 29, 2013, at 11:43 AM, Yogi Nerella wrote:

> Ralph,
> Sorry Ramya for side tracking this thread.
> I am also looking for some open source tools to collect all log records from various applications into a central place.
> Looking at Cassandra File System as well, but really do not understand what value we get by storing in it, instead of storing in NFS based file system.

Cassandra is not a file system. It is a NoSQL solution that stores the data elements as columns in rows. See http://nosql.mypopescu.com/post/2981945438/why-netflix-picked-amazon-simpledb-hadoop-hbase-and for a nice discussion on the differences between Hadoop, Cassandra and SimpleDB.

> Are there any good UI tools for searching the CFS database for administrators to look at errors?

No. That is similar to asking whether there are good tools for finding "xxxx" in database "yyyy", where "xxxx" is something specific to your problem. As a NoSQL database, Cassandra is not aimed at a specific problem such as capturing logs.

> Is MDC working with flume log4jappender? I couldn't make it work.

Flume's Log4j appender is for Log4j 1.x. I am working on Log4j 2, which is a complete rewrite, is designed for modern JDKs, and is meant to handle audit logging reliably. See http://logging.apache.org/log4j/2.x/manual/appenders.html#FlumeAppender. Log4j 2 supports writing either to a remote agent over Avro or to an embedded agent.

> log4jappender of flume has so many dependencies, are there any good "SocketAppenders", which can store and forward in the case of the server down?

One of the unfortunate aspects of Flume is that it does have a lot of dependencies. When you use the Avro remote appender you will only need a few jars from Log4j, Flume and Avro, and possibly a couple more that they depend on. If you use the embedded appender you will need quite a few more, since you are essentially running a Flume agent inside your application.

Generic socket appenders generally suffer from the problem that they cannot guarantee delivery. In other words, you get control back as soon as TCP says the data was delivered, but it might never make it into the Flume channel. When you use Avro you do get guaranteed delivery, because you don't get control back until the data is written to the Flume channel. A "Socket Appender" that can store and forward would be the Flume embedded appender in Log4j 2.
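
To make the difference concrete, here is a minimal sketch in plain Java. It is not Flume or Avro code; the class name AckedDelivery, the in-memory queue standing in for a Flume channel, and the "OK" reply are all assumptions made up for illustration. The point is only that the sender does not get control back until the receiver has stored the event and said so.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class AckedDelivery {
    // Hypothetical stand-in for a Flume channel: events are stored here before being acknowledged.
    static final BlockingQueue<String> CHANNEL = new LinkedBlockingQueue<>();

    static BufferedReader reader(Socket s) throws IOException {
        return new BufferedReader(new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
    }

    static PrintWriter writer(Socket s) throws IOException {
        // autoFlush=true so println() pushes each line out immediately
        return new PrintWriter(new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    // Illustrative receiver: accepts one connection, stores each line, then replies "OK".
    static void startReceiver(ServerSocket server) {
        Thread t = new Thread(() -> {
            try (Socket s = server.accept();
                 BufferedReader in = reader(s);
                 PrintWriter out = writer(s)) {
                String line;
                while ((line = in.readLine()) != null) {
                    CHANNEL.put(line);   // store the event first
                    out.println("OK");   // acknowledge only after it is stored
                }
            } catch (IOException | InterruptedException e) {
                // Sketch only: a real receiver would log the failure and recover.
            }
        });
        t.setDaemon(true);
        t.start();
    }

    // Sends one event and blocks until the receiver confirms it was stored.
    // A plain socket appender would return right after println(), without the readLine().
    static boolean sendAndAwaitAck(String event) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            startReceiver(server);
            try (Socket s = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 BufferedReader in = reader(s);
                 PrintWriter out = writer(s)) {
                out.println(event);
                return "OK".equals(in.readLine());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        boolean stored = sendAndAwaitAck("event-1");
        System.out.println("stored=" + stored + " channel=" + CHANNEL);
    }
}
```

If you drop the readLine() in sendAndAwaitAck you have the generic socket appender behaviour: the write succeeds locally and the caller never learns whether the event reached the queue.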

> Can I use "SocketAppender" and send the data to "flume agent" configured to receive netcat messages?

Yes, you can use a Socket Appender.  Unlike other logging frameworks, with Log4j 2 you can use a Layout to format the data going over the socket connection in any manner you want.
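
One thing to watch when formatting for the netcat source: it turns each line of text into an event, so whatever the Layout produces should put one event on one line. Below is a rough sketch of that idea in plain Java. It does not use the Log4j 2 Layout API; LineLayout, format and escape are hypothetical names for illustration, and the field order is just an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

public class LineLayout {
    // Renders one log event as exactly one newline-terminated line.
    // The "level key=value ... message" shape is an arbitrary choice for this sketch.
    static String format(String level, String message, Map<String, String> context) {
        StringBuilder sb = new StringBuilder();
        sb.append(level).append(' ');
        // TreeMap gives a stable key order, so the same event always renders the same way.
        for (Map.Entry<String, String> e : new TreeMap<>(context).entrySet()) {
            sb.append(e.getKey()).append('=').append(escape(e.getValue())).append(' ');
        }
        sb.append(escape(message)).append('\n');
        return sb.toString();
    }

    // Escapes backslashes and line breaks so an embedded newline cannot split the event in two.
    static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        Map<String, String> context = Map.of("user", "alice");
        System.out.print(format("ERROR", "disk full\nretrying", context));
    }
}
```

Without the escaping, a multi-line message such as a stack trace would arrive at a line-oriented receiver as several unrelated events.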