Pig >> mail # user >> UDF that takes bag as input and returns another bag


Re: UDF that takes bag as input and returns another bag

But he asked for a function that returns *another* bag ;)

Snark aside, when returning bags or tuples, it's also worthwhile to at
least consider defining the output schema, which for your example
code would probably mean

public Schema outputSchema(Schema input){
  Schema output = new Schema();
  output.add(input.getField(0));
  return output;
}

(possibly with some omitted exception handling)

-Kris

On Mon, Mar 18, 2013 at 11:19:17AM +0100, Jonathan Coveney wrote:
> Absolutely.
>
> public class MyUdf extends EvalFunc<DataBag> {
>   public DataBag exec(Tuple input) throws IOException {
>     return (DataBag)input.get(0);
>   }
> }
>
>
> A dummy example, but there you go. DataBag is a valid pig type like any
> other, so you just return it like you would normally.
>
>
> 2013/3/18 pranjal rajput <[EMAIL PROTECTED]>
>
> > Hi,
> > Can we define a UDF in pig that takes a bag as an input and returns another
> > bag as output?
> > How can this be done?
> > Thanks,
> > --
> > regards
> > Pranjal
> >

--
Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3
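
Putting the two replies in this thread together, a UDF that builds a genuinely new bag might look like the sketch below. It is only an illustration, not code from either poster: the class name DropNullFirst and its filtering rule (skip tuples whose first field is null) are made up for the example, and it assumes the Pig jar is on the classpath. Because the output bag holds the same kind of tuples as the input bag, outputSchema reuses the first input field as suggested above, with the exception handling filled in one possible way (returning null, i.e. "schema unknown").

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.logicalLayer.FrontendException;
import org.apache.pig.impl.logicalLayer.schema.Schema;

// Hypothetical UDF: copies the input bag into a new bag, skipping tuples
// whose first field is null. The name and the rule are illustrative only.
public class DropNullFirst extends EvalFunc<DataBag> {

  @Override
  public DataBag exec(Tuple input) throws IOException {
    // Pig passes the UDF arguments wrapped in a tuple; the bag is argument 0.
    if (input == null || input.size() == 0 || input.get(0) == null) {
      return null;
    }
    DataBag in = (DataBag) input.get(0);

    // Build a separate output bag instead of handing back the input bag.
    DataBag out = BagFactory.getInstance().newDefaultBag();
    for (Tuple t : in) {
      if (t != null && t.size() > 0 && t.get(0) != null) {
        out.add(t);
      }
    }
    return out;
  }

  @Override
  public Schema outputSchema(Schema input) {
    if (input == null) {
      return null;
    }
    try {
      // The output bag has the same tuple layout as the input bag,
      // so reuse the schema of the first input field.
      Schema output = new Schema();
      output.add(input.getField(0));
      return output;
    } catch (FrontendException e) {
      // Assumption: fall back to "schema unknown" rather than failing the script.
      return null;
    }
  }
}
```

Once the jar containing the class is registered, it could be called like any other function, e.g. FOREACH grouped GENERATE DropNullFirst(rows); where grouped and rows are placeholder names.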