We use Chukwa for near-real-time trending, which is conceptually similar to
near-real-time anomaly detection.
We use Chukwa agents, collectors, and Demux to collect log data in 5-minute
increments, which we then run MR jobs on, as Ari describes. It works well for
On Sun, May 29, 2011 at 10:02 AM, Ariel Rabkin <[EMAIL PROTECTED]> wrote:
> My impression is that web log analysis is the main use that people are
> putting Chukwa to.
> The idea is that you scoop up web logs, throw them into HDFS, and then
> run Pig jobs.
> On Sun, May 29, 2011 at 4:39 AM, Amos Shapira <[EMAIL PROTECTED]> wrote:
> > In case this interests anyone - I'm following Chukwa for such purposes.
> > Not just Google Analytics-like, but also hoping to use it for
> > near-real-time anomaly detection...
> > On 29 May 2011 18:19, Nikola Veber <[EMAIL PROTECTED]> wrote:
> >> Hello,
> >> I have just discovered Chukwa, and after the initial feeling that it
> >> would be a great tool to process large quantities of web logs and
> >> generate statistics like Google Analytics and co., I started searching
> >> the web for hints - but I couldn't find any clue regarding this.
> >> Has anyone tried using Chukwa for web analytics, or do you know of any
> >> a priori limitations that speak against using it in this manner?
> >> Thanks,
> >> Nikola
> Ari Rabkin [EMAIL PROTECTED]
> UC Berkeley Computer Science Department
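
For readers who want something concrete, the 5-minute-increment trending
mentioned at the top of this thread could be sketched roughly as below. This
is plain Python and does not use any Chukwa, MapReduce, or Pig API. The
(timestamp, path) record shape, the function names, and the "at least 2x
growth" threshold are all assumptions made for illustration, not anything
taken from the posters' actual jobs.

```python
from collections import Counter
from datetime import datetime, timezone

WINDOW_SECONDS = 300  # five-minute increments, as described in the thread


def window_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its five-minute window."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % WINDOW_SECONDS, tz=timezone.utc)


def count_hits(records):
    """Count hits per (window, path); records are (datetime, path) pairs.

    The record shape is hypothetical; real web logs would need parsing first.
    """
    counts = Counter()
    for ts, path in records:
        counts[(window_start(ts), path)] += 1
    return counts


def trending(counts, current, previous, min_ratio=2.0):
    """Return paths whose hits grew by at least min_ratio between two windows.

    min_ratio is an assumed threshold; a path unseen in the previous window
    is treated as having had one hit to avoid dividing by zero.
    """
    paths = {path for (window, path) in counts if window == current}
    result = []
    for path in sorted(paths):
        now = counts[(current, path)]
        before = counts.get((previous, path), 0)
        if now >= min_ratio * max(before, 1):
            result.append(path)
    return result
```

In a real deployment the counting step would be the job that runs over each
5-minute batch of collected data, and the comparison between adjacent windows
is the part that resembles simple anomaly detection.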