Pig, mail # user - outputSchema for UDF EvalFunc returning a DataBag


Re: outputSchema for UDF EvalFunc returning a DataBag
Raghu Angadi 2011-10-03, 23:14
Utils.getSchemaFromString() seems like exactly what you want
(in org.apache.pig.impl.util).

Raghu.

[btw. my two previous attempts to send to the list got rejected as spam ]
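
For readers following along, here is a minimal sketch of how a schema description string could be assembled before handing it to Utils.getSchemaFromString(). The class BagSchemaStrings and its bagOf helper are hypothetical names invented for illustration and are not part of Pig; the aliases b, t and i are likewise arbitrary. The generated text spells out the tuple inside the bag explicitly, matching the second form in the quoted question. The actual Pig call is left as a comment because it needs Pig on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Pig): builds the textual schema
// description for a bag whose single inner tuple has the given fields.
public class BagSchemaStrings {

    // fields maps field alias -> Pig type name, in declaration order.
    public static String bagOf(String bagAlias, String tupleAlias,
                               Map<String, String> fields) {
        StringJoiner inner = new StringJoiner(",", "(", ")");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            inner.add(field.getKey() + ":" + field.getValue());
        }
        // The tuple is written out explicitly inside the bag braces.
        return bagAlias + ":bag{" + tupleAlias + ":tuple" + inner + "}";
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("i", "int");
        String description = bagOf("b", "t", fields);
        System.out.println(description);

        // Assumption: with Pig on the classpath, the string could then be
        // parsed inside outputSchema() or a test, along the lines of:
        //   Schema schema =
        //       org.apache.pig.impl.util.Utils.getSchemaFromString(description);
    }
}
```

In a test for a UDF with flexible schemas, one string per case is usually easier to read and compare than the equivalent nested Schema and FieldSchema constructors.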

On Mon, Oct 3, 2011 at 3:41 PM, Andrew Clegg
<andrew.clegg+[EMAIL PROTECTED]> wrote:

> Thanks Raghu (and Dmitry).
>
> Could this maybe get added to the docs page on UDFs? (Apologies if
> it's there already and I missed it.)
>
> Also -- it's a bit cumbersome writing all these nested Schema and
> FieldSchema constructors, especially when you're writing tests for
> UDFs with flexible schema support.
>
> I was wondering if it would be practical to reuse whatever code the
> front-end uses to parse schema descriptions from load statements in
> scripts. Is this a silly idea? If it isn't silly, does anyone know
> where I need to look for that code?
>
>
> On 3 October 2011 22:56, Raghu Angadi <[EMAIL PROTECTED]> wrote:
> > My understanding is that Pig 0.8 expects the first form and Pig 0.9
> > requires the second.
> >
> > Raghu.
> >
> > On Mon, Oct 3, 2011 at 8:27 AM, Andrew Clegg
> > <andrew.clegg+[EMAIL PROTECTED]> wrote:
> >
> >> Hi,
> >>
> >> When you have a UDF that returns a bag, and you're writing the
> >> outputSchema method, do you have to explicitly include the mandatory
> >> 'container' tuple within the bag, or is this implicit?
> >>
> >> i.e. if I'm returning a bag of ints, do I have to do:
> >>
> >> return new Schema(
> >>  new FieldSchema(null,
> >>    new Schema(
> >>      new FieldSchema(null, DataType.INTEGER)), DataType.BAG));
> >>
> >> Or do I have to explicitly define a tuple like so:
> >>
> >> return new Schema(
> >>  new FieldSchema(null,
> >>    new Schema(
> >>      new FieldSchema(null,
> >>        new Schema(
> >>          new FieldSchema(null, DataType.INTEGER)),
> >>        DataType.TUPLE)),
> >>    DataType.BAG));
> >>
> >> The docs seem pretty vague on this, and you're allowed to do either.
> >> My feeling would be that if the first form was illegal, you wouldn't
> >> be allowed to create a schema like that, but this may be wishful
> >> thinking.
> >>
> >> Thanks,
> >>
> >> Andrew.
> >>
> >> --
> >>
> >> http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg
> >>
> >
>
>
>
> --
>
> http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg
>