Pig >> mail # user >> Using UDF to process whole record


Stanley Xu 2013-01-22, 06:50
Vitalii Tymchyshyn 2013-01-22, 11:27
Young Ng 2013-01-22, 07:34
Vitalii Tymchyshyn 2013-01-22, 11:34
Re: Using UDF to process whole record
Thanks all. It looks like I need to upgrade to Pig 0.10, since UDF(*) does
not appear to be supported in 0.8.1.
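
For reference, here is a minimal sketch of the script from the original question rewritten with the star expression from the Pig 0.10 documentation example quoted below. It assumes Pig 0.10 or later, keeps the placeholder UDF name com.udf.SomeUDF from the question, and omits jar registration just as the original script does. The original error most likely comes from passing the relation alias raw_data as a UDF argument, which Pig appears to parse as a scalar reference; the star expression passes the fields of the current record instead.

```pig
-- Sketch only: assumes Pig 0.10+; com.udf.SomeUDF is the placeholder name from the question.
raw_data = LOAD '$INPUT' USING
    com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME');

-- Print the schema the loader derives, to see which fields the UDF will receive.
DESCRIBE raw_data;

-- The star expression hands all fields of the current record to the UDF
-- as its input tuple, instead of naming the relation itself.
A = FOREACH raw_data GENERATE com.udf.SomeUDF(*);
B = LIMIT A 10;
DUMP B;
```

With this form, the UDF's exec method would see the record's fields as the elements of its input tuple, so the DESCRIBE output is a quick way to check the order and types it should expect.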

Best wishes,
Stanley Xu
On Tue, Jan 22, 2013 at 7:34 PM, Vitalii Tymchyshyn <[EMAIL PROTECTED]> wrote:

> BTW: http://pig.apache.org/docs/r0.10.0/basic.html has the following example:
> C = FOREACH A GENERATE name, age, MyUDF(*);
> That looks like exactly what you need.
> On 22 Jan 2013 at 09:35, "Young Ng" <[EMAIL PROTECTED]> wrote:
>
> > try something like:
> >     raw_data = load ....... as record;
> >
> >     ...... com.udf.SomeUDF(record)
> >
> > or you can describe raw_data to check the schema.
> >
> >
> > On Jan 21, 2013, at 10:50 PM, Stanley Xu <[EMAIL PROTECTED]> wrote:
> >
> > > Dear all,
> > >
> > > We are using Thrift and elephant-bird to store our logs. I wanted to
> > > use a UDF to do complex processing on a single record, so I wrote
> > > some Pig like the following:
> > >
> > > ========================================================
> > > raw_data = load '$INPUT' using
> > > com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME')
> > >
> > > A = FOREACH raw_data GENERATE com.udf.SomeUDF(raw_data);
> > > B = LIMIT A 10;
> > > DUMP B;
> > > ========================================================
> > >
> > > But Pig tells me: "ERROR 1000: Error during parsing. Scalars can be
> > > only used with projections"
> > >
> > > Is there any way I could run a UDF on raw_data here?
> > >
> > >
> > > Best wishes,
> > > Stanley Xu
> >
> >
>