Pig >> mail # user >> Using UDF to process whole record


Using UDF to process whole record
Dear all,

We are using Thrift and elephant-bird to store our logs. I want to use a
UDF to do complex processing on a single record, so I wrote a Pig script
like the following:

========================================================
raw_data = load '$INPUT' using
com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME');

A = FOREACH raw_data GENERATE com.udf.SomeUDF(raw_data);
B = LIMIT A 10;
DUMP B;
========================================================
But Pig gives me this error: "ERROR 1000: Error during parsing. Scalars can be
only used with projections"

Is there any way I could run a UDF on the whole raw_data record here?
Best wishes,
Stanley Xu