I'm researching Flume as a solution for web analytics.
I have read some material about it, and my idea is to use Flume to collect the
logs and put them in a Cassandra database. But first I have some doubts:
Is it a good approach to process the logs "on the fly" and insert them into the
database?
Or is it better to collect the logs, store them (e.g. in HDFS), and have
scheduled jobs with Pig process them and later insert the results into a
database like HBase?
I found an interesting solution made by Gemini (now Cloudian) called
logprocessing. Has anyone used it?
Jeff Lord 2013-02-25, 16:37
Daniel Bruno 2013-02-25, 20:38