Pig >> mail # user >> How to read Mahout generated sequence files in Pig


keeyong han 2013-02-04, 23:58
Harsha 2013-02-05, 00:06
David LaBarbera 2013-02-06, 20:37
RE: How to read Mahout generated sequence files in Pig
Thanks, guys. That's what I eventually figured out. It works well on Apache Hadoop and CDH, but not so well on the DataStax package (DSE 2.1.1). I filed a bug report with DataStax about that.

Cheers,
-Keeyong

> Subject: Re: How to read Mahout generated sequence files in Pig
> From: [EMAIL PROTECTED]
> Date: Wed, 6 Feb 2013 15:37:05 -0500
> To: [EMAIL PROTECTED]
>
> The Elephant Bird sequence file loader should work; you'll just need to register the Mahout jar that contains the vector writable they use.
>
> David
>
> On Feb 4, 2013, at 7:06 PM, Harsha <[EMAIL PROTECTED]> wrote:
>
> > keeyong,
> >    We used Elephant Bird (https://github.com/kevinweil/elephant-bird) from Twitter to read and write sequence files.
> > Take a look at these classes: com.twitter.elephantbird.pig.store.SequenceFileStorage and com.twitter.elephantbird.pig.load.SequenceFileLoader.
> >
> > --
> > Harsha
> >
> >
> > On Monday, February 4, 2013 at 3:58 PM, keeyong han wrote:
> >
> >> I am wondering how I can read Mahout-generated sequence files in Pig. I guess there might be a UDF, but I haven't found one yet.
> >>
> >> Cheers,
> >> -Keeyong
> >>
> >>
> >
>
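
For reference, the approach described in this thread might look roughly like the Pig Latin sketch below. It is only a sketch under several assumptions that are not stated in the thread: the jar names and the input path are placeholders, the sequence file keys are assumed to be Text, the values are assumed to be Mahout VectorWritable objects, and the value converter class is assumed to come from Elephant Bird's Mahout module in the version you have installed. Check the class names against your Elephant Bird release before relying on it.

```pig
-- Jar names and paths are placeholders; adjust them to your installation.
REGISTER elephant-bird-core.jar;
REGISTER elephant-bird-pig.jar;
REGISTER elephant-bird-mahout.jar;
-- The Mahout jar(s) that provide the vector writable class.
REGISTER mahout-math.jar;
REGISTER mahout-core.jar;

-- Assumes Text keys and VectorWritable values. If the files use another key
-- type, swap the first converter (for example, IntWritableConverter).
vectors = LOAD '/path/to/mahout/output' USING
    com.twitter.elephantbird.pig.load.SequenceFileLoader(
        '-c com.twitter.elephantbird.pig.util.TextConverter',
        '-c com.twitter.elephantbird.pig.mahout.VectorWritableConverter'
    );

-- Inspect a few records before building on the relation.
first_rows = LIMIT vectors 10;
DUMP first_rows;
```

Each `-c` argument names the converter for one side of the key/value pair, key first and value second. Looking at a handful of rows first is a cheap way to confirm that the converters match the actual key and value types in the files.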
     
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB