Chukwa >> mail # user >> Using Chukwa for Web Analytics


Re: Using Chukwa for Web Analytics
We use Chukwa for near-real-time trending, which is conceptually similar to
near-real-time anomaly detection.

We use Chukwa agents, collectors, and Demux to collect log data in 5-minute
increments, which we then run MR jobs on, as Ari describes. It works well for
us.
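
To make the idea concrete, here is a rough sketch of the kind of per-window counting a trending job does. It is plain Python standing in for an MR job, not Chukwa or Demux code. The function names, the (timestamp, key) record shape, and the `min_ratio` threshold are all assumptions made up for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone

WINDOW_SECONDS = 5 * 60  # assumed to match the 5-minute collection increment


def window_start(ts: datetime) -> datetime:
    """Round a timezone-aware timestamp down to the start of its 5-minute window."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % WINDOW_SECONDS, tz=timezone.utc)


def count_by_window(records):
    """Group (timestamp, key) records by window and count hits per key.

    This plays the role of the map (assign window) and reduce (sum) steps.
    """
    counts = defaultdict(Counter)
    for ts, key in records:
        counts[window_start(ts)][key] += 1
    return counts


def trending(counts, current, previous, min_ratio=2.0):
    """Return keys whose count grew by at least min_ratio between two windows.

    A key missing from the previous window is treated as having a count of 1,
    so brand-new keys need at least min_ratio hits to qualify (hypothetical rule).
    """
    prev = counts.get(previous, Counter())
    cur = counts.get(current, Counter())
    return sorted(k for k, n in cur.items() if n >= min_ratio * max(prev[k], 1))
```

In a setup like this, you would call `count_by_window` on each batch of parsed log records and then compare the newest window against the one before it with `trending`. An anomaly detector would look much the same, with a different comparison rule.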
On Sun, May 29, 2011 at 10:02 AM, Ariel Rabkin <[EMAIL PROTECTED]> wrote:

> My impression is that web log analysis is the main use that people are
> putting Chukwa to.
> The idea is that you scoop up web logs, throw them into HDFS, and then
> run Pig jobs.
>
> --Ari
>
> On Sun, May 29, 2011 at 4:39 AM, Amos Shapira <[EMAIL PROTECTED]> wrote:
> > In case this interests anyone - I'm following Chukwa for such purposes too.
> > Not just Google Analytics-like but also hoping to use it for near real time
> > anomaly detection...
> >
> > On 29 May 2011 18:19, Nikola Veber <[EMAIL PROTECTED]> wrote:
> >>
> >> Hello,
> >>
> >> I have just discovered Chukwa, and after the initial feeling that it
> >> would be a great tool to process large quantities of web logs and
> >> generate statistics like Google Analytics and co., I started searching
> >> the web for hints, but I couldn't find any clue regarding this.
> >>
> >> Has anyone tried using Chukwa for Web-Analytics, or do you know any
> >> a-priori limitations which speak against using it in this manner?
> >>
> >>
> >> Thanks,
> >> Nikola
> >
> >
>
>
>
> --
> Ari Rabkin [EMAIL PROTECTED]
> UC Berkeley Computer Science Department
>