Chukwa >> mail # user >> How to extract only the desired information using Chuka


Mohammad Tariq 2011-11-17, 10:47
Ahmed Fathalla 2011-11-17, 10:50
Mohammad Tariq 2011-11-17, 10:53
Re: How to extract only the desired information using Chuka
After the demux process, the data is stored in Hadoop as a sequence file.
One easy way to read it is to use Pig via the ChukwaLoader:

http://svn.apache.org/viewvc/incubator/chukwa/trunk/contrib/chukwa-pig/src/java/org/apache/hadoop/chukwa/pig/ChukwaLoader.java?view=markup

Note that ChukwaLoader reads the data with a SequenceFileRecordReader typed
as follows, so if you don't want to use Pig, you could do something similar:
SequenceFileRecordReader<ChukwaRecordKey, ChukwaRecord>

The ChukwaRecord contains a handful of fields created by the Processor that
you've configured to collect your data. If you're using the TSProcessor, the
payload is in a field called 'body', IIRC.
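
To illustrate the idea of keeping only the payload, here's a rough Java
sketch. It doesn't touch the Chukwa or Hadoop classes at all: each record is
modelled as a plain Map of field names to values (my stand-in for
ChukwaRecord, not the real class), BodyExtractor and extractBodies are
made-up names, and the 'body' field name is only the guess from above. In a
real job the records would come from the sequence file reader instead.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: a record is modelled as field name -> value.
// This is NOT the real ChukwaRecord; it only mimics its "named fields" shape.
final class BodyExtractor {

    // Assumed field name, taken from the "IIRC" remark above; check it
    // against your own demux output before relying on it.
    static final String BODY_FIELD = "body";

    // Returns only the payload of each record, skipping records without one.
    static List<String> extractBodies(List<Map<String, String>> records) {
        List<String> bodies = new ArrayList<>();
        for (Map<String, String> record : records) {
            String body = record.get(BODY_FIELD);
            if (body != null) {
                bodies.add(body);
            }
        }
        return bodies;
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("body", "2011-11-17 10:47:00 INFO started");
        first.put("source", "host-a"); // made-up metadata field

        Map<String, String> second = new LinkedHashMap<>();
        second.put("source", "host-b"); // record with no payload

        List<Map<String, String>> records = new ArrayList<>();
        records.add(first);
        records.add(second);

        // Prints just the payload lines, without the other fields.
        for (String body : extractBodies(records)) {
            System.out.println(body);
        }
    }
}
```

The point is just that the record carries several named fields and you pick
out the one that holds the original content, dropping the rest.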

There's also a command-line Java tool that dumps the contents of a sequence
file to stdout, which can be handy. I forget what it's called, but it
should be in the docs.

On Thu, Nov 17, 2011 at 2:53 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:

> Oh, in that case I have to wait for their reply and keep on trying
> till then. Thanks for the reply, Ahmed.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Thu, Nov 17, 2011 at 4:20 PM, Ahmed Fathalla <[EMAIL PROTECTED]> wrote:
> > Hmm... maybe in the demux part of the system (I think it utilizes Pig
> > scripts somewhere). I'm not an expert in this; maybe Ari, Bill or Eric
> > can help on this.
> >
> > On Thu, Nov 17, 2011 at 12:47 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
> >>
> >> Is it possible for us to extract only the actual content present
> >> inside a file without any other information, using Chukwa??
> >>
> >> Regards,
> >>     Mohammad Tariq
> >
> >
> >
> > --
> > Ahmed Fathalla
> >
>
Mohammad Tariq 2011-11-17, 18:32