Pig >> mail # user >> outputSchema for UDF EvalFunc returning a DataBag


Re: outputSchema for UDF EvalFunc returning a DataBag
Utils.getSchemaFromString() seems like exactly what you want (from
org.apache.pig.impl.util).

Raghu.

[btw, my two previous attempts to send to the list got rejected as spam]
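
A minimal sketch of how the suggestion above might be used to avoid nested Schema and FieldSchema constructors: build the schema description as a string and hand it to Utils.getSchemaFromString(). The helper class BagSchemaStrings, the method bagOf, and the aliases "t" and "v" are hypothetical names chosen for illustration. The Pig call itself appears only in a comment so the block runs without Pig on the classpath, and its exact signature and exception type should be checked against the Pig version in use.

```java
// Hypothetical helper (not part of Pig): builds schema description strings
// so a UDF or its tests can avoid nesting Schema/FieldSchema constructors.
final class BagSchemaStrings {

    // Describes a bag whose tuples hold one field of the given Pig type.
    // The enclosing tuple is written out explicitly, matching the second
    // form asked about in the thread. "t" and "v" are placeholder aliases.
    static String bagOf(String bagAlias, String pigType) {
        return bagAlias + ": {t: (v: " + pigType + ")}";
    }

    public static void main(String[] args) {
        String description = bagOf("ints", "int");
        System.out.println(description);

        // Inside a UDF, with Pig on the classpath, the string would be
        // passed to the method named in the thread, roughly:
        //   Schema s = org.apache.pig.impl.util.Utils
        //                  .getSchemaFromString(description);
        // This is assumed usage, not verified against a specific release.
    }
}
```

Keeping the description as a plain string means a test can compare expected and actual schemas by their textual form, which is usually easier to read than a tree of constructors.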

On Mon, Oct 3, 2011 at 3:41 PM, Andrew Clegg
<andrew.clegg+[EMAIL PROTECTED]> wrote:

> Thanks Raghu (and Dmitry).
>
> Could this maybe get added to the docs page on UDFs? (Apologies if
> it's there already and I missed it.)
>
> Also -- it's a bit cumbersome writing all these nested Schema and
> FieldSchema constructors, especially when you're writing tests for
> UDFs with flexible schema support.
>
> I was wondering if it would be practical to reuse whatever code the
> front-end uses to parse schema descriptions from load statements in
> scripts. Is this a silly idea? If it isn't silly, does anyone know
> where I need to look for that code?
>
>
> On 3 October 2011 22:56, Raghu Angadi <[EMAIL PROTECTED]> wrote:
> > My understanding is that Pig 0.8 expects the first form and Pig 0.9
> > requires the second.
> >
> > Raghu.
> >
> > On Mon, Oct 3, 2011 at 8:27 AM, Andrew Clegg
> > <andrew.clegg+[EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >>
> >> When you have a UDF that returns a bag, and you're writing the
> >> outputSchema method, do you have to explicitly include the mandatory
> >> 'container' tuple within the bag, or is this implicit?
> >>
> >> i.e. if I'm returning a bag of ints, do I have to do:
> >>
> >> return new Schema(
> >>  new FieldSchema(null,
> >>    new Schema(
> >>      new FieldSchema(null, DataType.INTEGER)), DataType.BAG));
> >>
> >> Or do I have to explicitly define a tuple like so:
> >>
> >> return new Schema(
> >>  new FieldSchema(null,
> >>    new Schema(
> >>      new FieldSchema(null,
> >>        new Schema(
> >>          new FieldSchema(null, DataType.INTEGER)), DataType.TUPLE)),
> >>    DataType.BAG));
> >>
> >> The docs seem pretty vague on this, and you're allowed to do either.
> >> My feeling would be that if the first form was illegal, you wouldn't
> >> be allowed to create a schema like that, but this may be wishful
> >> thinking.
> >>
> >> Thanks,
> >>
> >> Andrew.
> >>
> >> --
> >>
> >> http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg
> >>
> >
>
>
>
> --
>
> http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg
>