Re: Using UDF to process whole record
BTW: http://pig.apache.org/docs/r0.10.0/basic.html has the following example:
C = FOREACH A GENERATE name, age, MyUDF(*);
That looks like exactly what you need.
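
Applied to the script quoted below, a minimal, untested sketch might look like this. It assumes com.udf.SomeUDF accepts all fields of the record as its input; the load statement and names are taken from the original question. The parse error most likely comes from using the relation alias raw_data inside GENERATE, where Pig treats a bare relation name as a scalar.

```pig
raw_data = LOAD '$INPUT' USING
    com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME');

-- '*' passes every field of the current record to the UDF,
-- instead of naming the relation alias (raw_data) itself.
-- Assumption: SomeUDF is written to take the whole record as input.
A = FOREACH raw_data GENERATE com.udf.SomeUDF(*);
B = LIMIT A 10;
DUMP B;
```

Running DESCRIBE raw_data; first shows which fields the UDF will receive.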
On 22 Jan 2013 09:35, "Young Ng" <[EMAIL PROTECTED]> wrote:

> try something like:
>     raw_data = load ....... as record;
>
>     ...... com.udf.SomeUDF(record)
>
> or you can describe raw_data to check the schema.
>
>
> On Jan 21, 2013, at 10:50 PM, Stanley Xu <[EMAIL PROTECTED]> wrote:
>
> > Dear all,
> >
> > We are using thrift and elephant-bird to store our logs. I wanted to
> > use a UDF to do complex processing on a single record, so I wrote some
> > Pig like the following:
> >
> > ========================================================
> > raw_data = load '$INPUT' using
> > com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME')
> >
> > A = FOREACH raw_data GENERATE com.udf.SomeUDF(raw_data);
> > B = LIMIT A 10;
> > DUMP B;
> > ========================================================
> >
> > But Pig tells me "ERROR 1000: Error during parsing. Scalars can be
> > only used with projections"
> >
> > Is there any way I could run a UDF on the raw_data here?
> >
> >
> > Best wishes,
> > Stanley Xu
>
>