Stanley Xu 2013-01-22, 06:50
Vitalii Tymchyshyn 2013-01-22, 11:27
Young Ng 2013-01-22, 07:34
Vitalii Tymchyshyn 2013-01-22, 11:34
Re: Using UDF to process whole record
Stanley Xu 2013-01-24, 03:56
Thanks all, it looks like I need to upgrade to Pig 0.10, since UDF(*) does not
seem to be supported in 0.8.1.

Best wishes,
Stanley Xu

On Tue, Jan 22, 2013 at 7:34 PM, Vitalii Tymchyshyn <[EMAIL PROTECTED]> wrote:

> BTW: http://pig.apache.org/docs/r0.10.0/basic.html has the following example:
> C = FOREACH A GENERATE name, age, MyUDF(*);
> Looks like exactly what you need.
>
> On 22 Jan 2013 09:35, "Young Ng" <[EMAIL PROTECTED]> wrote:
>
> > try something like:
> > raw_data = load ....... as record;
> >
> > ...... com.udf.SomeUDF(record)
> >
> > or you can describe raw_data to check the schema.
> >
> >
> > On Jan 21, 2013, at 10:50 PM, Stanley Xu <[EMAIL PROTECTED]> wrote:
> >
> > > Dear all,
> > >
> > > We are using thrift and elephant-bird to store our logs. I wanted to
> > > use a UDF to do complex processing on a single record, so I wrote some
> > > Pig like the following:
> > >
> > > ========================================================
> > > raw_data = load '$INPUT' using
> > > com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME')
> > >
> > > A = FOREACH raw_data GENERATE com.udf.SomeUDF(raw_data);
> > > B = LIMIT A 10;
> > > DUMP B;
> > > ========================================================
> > >
> > > But Pig tells me "ERROR 1000: Error during parsing. Scalars can be
> > > only used with projections"
> > >
> > > Is there any way I could run a UDF on the raw_data here?
> > >
> > >
> > > Best wishes,
> > > Stanley Xu
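
For reference, the suggestion from the thread applied to the original script might look like the sketch below. It assumes Pig 0.10 or later (the version whose documentation is linked above), that the elephant-bird and UDF jars are already registered, and that com.udf.SomeUDF accepts a tuple holding all fields of the record; $INPUT, $CLASSNAME and com.udf.SomeUDF are the placeholders from the original post, not real names. The parse error seems to arise because naming the relation itself inside GENERATE is read as a scalar reference, so the sketch passes the fields of the current record with * instead.

```pig
-- Sketch only: $INPUT, $CLASSNAME and com.udf.SomeUDF come from the original post.
raw_data = LOAD '$INPUT' USING
    com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME');

-- Check which fields the loader produces before handing them to the UDF.
DESCRIBE raw_data;

-- Pass every field of the current record with *, rather than the relation name.
A = FOREACH raw_data GENERATE com.udf.SomeUDF(*);
B = LIMIT A 10;
DUMP B;
```

If the UDF only needs some of the fields, listing them by the names that DESCRIBE reports, as in Young Ng's suggestion, would be an alternative that does not rely on the * form.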