Pig >> mail # user >> Using UDF to process whole record


Stanley Xu 2013-01-22, 06:50
Vitalii Tymchyshyn 2013-01-22, 11:27
Young Ng 2013-01-22, 07:34
Vitalii Tymchyshyn 2013-01-22, 11:34

Re: Using UDF to process whole record
Thanks all. It looks like I need to upgrade to Pig 0.10, since UDF(*) does
not appear to be supported in 0.8.1.
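
A minimal sketch of how the original script might look with the star expression from the 0.10 docs quoted below. It is untested and assumes Pig 0.10 or later; com.udf.SomeUDF is the UDF from the original script, and any REGISTER statements for the jars are assumed to be unchanged and are omitted here.

```pig
-- Sketch only, untested; assumes Pig 0.10 or later.
-- '$INPUT' and '$CLASSNAME' are the parameters from the original script,
-- and com.udf.SomeUDF is the poster's own UDF, not a library class.
raw_data = LOAD '$INPUT' USING
    com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME');

-- The star expression passes every field of the current record to the UDF.
-- Writing SomeUDF(raw_data) instead names the relation itself, which Pig
-- reads as a scalar reference; that is what the ERROR 1000 message is about.
A = FOREACH raw_data GENERATE com.udf.SomeUDF(*);
B = LIMIT A 10;
DUMP B;
```

With this form the UDF would receive the whole record as its input tuple, so it would have to pick fields out by position.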

Best wishes,
Stanley Xu
On Tue, Jan 22, 2013 at 7:34 PM, Vitalii Tymchyshyn <[EMAIL PROTECTED]> wrote:

> BTW: http://pig.apache.org/docs/r0.10.0/basic.html has the following example:
> C = FOREACH A GENERATE name, age, MyUDF(*);
> That looks like exactly what you need.
> On 22 Jan 2013 09:35, "Young Ng" <[EMAIL PROTECTED]> wrote:
>
> > try something like:
> >     raw_data = load ....... as record;
> >
> >     ...... com.udf.SomeUDF(record)
> >
> > Or you can run DESCRIBE on raw_data to check the schema.
> >
> >
> > On Jan 21, 2013, at 10:50 PM, Stanley Xu <[EMAIL PROTECTED]> wrote:
> >
> > > Dear all,
> > >
> > > We are using Thrift and elephant-bird to store our logs, and I wanted to
> > > use a UDF to do complex processing on a single record, so I wrote some
> > > Pig like the following:
> > >
> > > ========================================================
> > > raw_data = load '$INPUT' using
> > > com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME')
> > >
> > > A = FOREACH raw_data GENERATE com.udf.SomeUDF(raw_data);
> > > B = LIMIT A 10;
> > > DUMP B;
> > > ========================================================
> > >
> > > But Pig tells me: "ERROR 1000: Error during parsing. Scalars can be
> > > only used with projections"
> > >
> > > Is there any way I could run a UDF on raw_data here?
> > >
> > >
> > > Best wishes,
> > > Stanley Xu
> >
> >
>