Pig >> mail # user >> Many to One UDF Problem


Re: Many to One UDF Problem
Dipesh,

You can pass in the entire tuple (row) to the UDF.

unintended = foreach player generate name, id, Dudf_try(*);

The UDF will then be able to access the entire row:

Tuple tuple = (Tuple)input.get(0);

To process individual fields, you can iterate over the tuple or access its
fields positionally:

String name = tuple.get(0).toString();
String id = tuple.get(1).toString();
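To make the pattern concrete, here is a minimal, self-contained sketch of the field-joining logic. It stands in a plain List<Object> for the Pig Tuple (and the class name JoinFieldsSketch is made up for illustration) so it runs without the Pig jars; in the real EvalFunc you would apply the same loop to the Tuple obtained from input.get(0) when the script passes Dudf_try(*).

```java
import java.util.Arrays;
import java.util.List;

public class JoinFieldsSketch {

    // Mirrors the UDF body: join every field of the row with "<>".
    // In the real EvalFunc, 'row' would be the inner Tuple, and
    // row.get(i) would be tuple.get(i).
    static String joinFields(List<Object> row) {
        if (row == null || row.isEmpty())
            return null;
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < row.size(); i++) {
            if (i > 0)
                out.append("<>");              // field separator
            out.append(row.get(i).toString()); // positional access
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(joinFields(Arrays.asList("John", "12"))); // prints John<>12
    }
}
```

Because the loop runs over row.size(), the same code keeps working if more columns are added to the relation later, which is the advantage of passing the whole row with * rather than listing fields in the script.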

-Prashant
On Wed, May 9, 2012 at 12:00 PM, DIPESH KUMAR SINGH
<[EMAIL PROTECTED]>wrote:

> (Yet another basic udf question)
>
> I want my udf to take values of all the columns in a row.
>
> For example: If there are 3 records in my input file. (Tab delimited row)
>
> John   12
> Jeff     33
> Chin    20
>
> Currently my UDF can only take one column (I don't know how to pass more
> than one):
>
> register 'dudf.jar';
> player = load '/pig_data/dxmlsample1.txt' as (name:chararray, id:chararray);
> -- As I have only passed name here; I want the whole row to be passed,
> -- i.e. name and id.
> unintended = foreach player generate name, id, Dudf_try(name);
> dump unintended;
>
> My UDF code is:
>
> import java.io.IOException;
> import java.util.List;
> import java.util.ArrayList;
>
> import org.apache.pig.EvalFunc;
> import org.apache.pig.FuncSpec;
> import org.apache.pig.data.Tuple;
> import org.apache.pig.data.DataType;
> import org.apache.pig.impl.logicalLayer.schema.Schema;
> import org.apache.pig.impl.logicalLayer.FrontendException;
>
> public class Dudf_try extends EvalFunc<String> {
>     public String exec(Tuple input) throws IOException {
>         if (input == null || input.size() == 0)
>             return null;
>         try {
>             String query = (String)input.get(0);
>             //String query1 = (String)input.get(1);
>
>             // Some more transformation here, but ultimate output is String
>
>             return query+"<>"+query1;
>         } catch (Exception e) {
>             System.err.println("failed to process input; error - " + e.getMessage());
>             return null;
>         }
>     }
>
>     @Override
>     public Schema outputSchema(Schema input) {
>         return new Schema(new Schema.FieldSchema(
>             getSchemaName(this.getClass().getName().toLowerCase(), input),
>             DataType.CHARARRAY));
>     }
>
>     @Override
>     public List<FuncSpec> getArgToFuncMapping() throws FrontendException {
>         List<FuncSpec> funcList = new ArrayList<FuncSpec>();
>         funcList.add(new FuncSpec(this.getClass().getName(),
>             new Schema(new Schema.FieldSchema(null, DataType.CHARARRAY))));
>         return funcList;
>     }
> }
>
> I need some suggestions on how to proceed with the intended behavior.
>
> Thanks!
> Dipesh
>
> --
> Dipesh Kr. Singh
>