Pig >> mail # user >> Using UDF to process whole record


+
Stanley Xu 2013-01-22, 06:50
+
Vitalii Tymchyshyn 2013-01-22, 11:27
Re: Using UDF to process whole record
Try something like:
    raw_data = load ....... as record;

    ...... com.udf.SomeUDF(record)

Or you can run DESCRIBE on raw_data to check the schema.
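
As a rough sketch of one way to hand the whole record to the UDF (not necessarily what the poster ended up using): in a FOREACH, `*` stands for every field of the current tuple, so the UDF receives the full record as its input tuple. The loader class and com.udf.SomeUDF are taken from the original question; $INPUT and $CLASSNAME are assumed to be parameters passed to the script, and the schema printed by DESCRIBE depends on the Thrift class.

```pig
-- Sketch only: com.udf.SomeUDF is the poster's own UDF; $INPUT and $CLASSNAME
-- are assumed to be script parameters (for example supplied with -param).
raw_data = LOAD '$INPUT' USING
    com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME');

-- Print the schema the loader exposes for the Thrift class.
DESCRIBE raw_data;

-- '*' expands to all fields of the current tuple, so the UDF's exec(Tuple)
-- sees the whole record. Naming the relation itself (raw_data) inside
-- GENERATE is what Pig parses as a scalar reference.
A = FOREACH raw_data GENERATE com.udf.SomeUDF(*);
B = LIMIT A 10;
DUMP B;
```

Inside the UDF, the fields then arrive in schema order as elements of the input tuple.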
On Jan 21, 2013, at 10:50 PM, Stanley Xu <[EMAIL PROTECTED]> wrote:

> Dear all,
>
> We are using Thrift and elephant-bird to store our logs. I want to use a
> UDF to do complex processing on a single record, so I wrote a Pig script
> like the following:
>
> ========================================================> raw_data = load '$INPUT' using
> com.twitter.elephantbird.pig.load.LzoThriftBlockPigLoader('$CLASSNAME')
>
> A = FOREACH raw_data GENERATE com.udf.SomeUDF(raw_data);
> B = LIMIT A 10;
> DUMP B;
> ========================================================>
> But Pig reports "ERROR 1000: Error during parsing. Scalars can be
> only used with projections".
>
> Is there any way I could run a UDF on the raw_data here?
>
>
> Best wishes,
> Stanley Xu
+
Vitalii Tymchyshyn 2013-01-22, 11:34
+
Stanley Xu 2013-01-24, 03:56