

Re: outputSchema for UDF EvalFunc returning a DataBag
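My understanding is that Pig 0.8 expects the first form (the bag schema
holds the int field directly) and Pig 0.9 requires the second (the bag
schema holds an explicit tuple, which in turn holds the int field).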

Raghu.

On Mon, Oct 3, 2011 at 8:27 AM, Andrew Clegg
<andrew.clegg+[EMAIL PROTECTED]> wrote:

> Hi,
>
> When you have a UDF that returns a bag, and you're writing the
> outputSchema method, do you have to explicitly include the mandatory
> 'container' tuple within the bag, or is this implicit?
>
> i.e. if I'm returning a bag of ints, do I have to do:
>
> return new Schema(
>  new FieldSchema(null,
>    new Schema(
>      new FieldSchema(null, DataType.INTEGER)), DataType.BAG));
>
> Or do I have to explicitly define a tuple like so:
>
> return new Schema(
>  new FieldSchema(null,
>    new Schema(
>      new FieldSchema(null,
>        new Schema(
>          new FieldSchema(null, DataType.INTEGER)), DataType.TUPLE)),
>    DataType.BAG));
>
> The docs seem pretty vague on this, and you're allowed to do either.
> My feeling would be that if the first form was illegal, you wouldn't
> be allowed to create a schema like that, but this may be wishful
> thinking.
>
> Thanks,
>
> Andrew.
>
> --
>
> http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg
>
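
For reference, here is a minimal sketch of the second form (explicit tuple inside the bag) placed in a complete UDF. The class name IntsToBag and the exec body are hypothetical and only there so the example compiles; the schema nesting is taken from the question above. It assumes Pig's org.apache.pig.impl.logicalLayer.schema.Schema API, where the FieldSchema constructor that takes a nested Schema declares FrontendException, so the snippets in the question would also need a try/catch around them.

```java
import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.BagFactory;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.DataType;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;
import org.apache.pig.impl.logicalLayer.FrontendException;
import org.apache.pig.impl.logicalLayer.schema.Schema;
import org.apache.pig.impl.logicalLayer.schema.Schema.FieldSchema;

// Hypothetical UDF: puts each field of the input tuple into its own
// one-field tuple and returns them all in a bag.
public class IntsToBag extends EvalFunc<DataBag> {

    @Override
    public DataBag exec(Tuple input) throws IOException {
        DataBag bag = BagFactory.getInstance().newDefaultBag();
        if (input == null) {
            return bag;
        }
        for (int i = 0; i < input.size(); i++) {
            // Every element of a bag is a tuple, here with a single field.
            bag.add(TupleFactory.getInstance().newTuple(input.get(i)));
        }
        return bag;
    }

    @Override
    public Schema outputSchema(Schema input) {
        try {
            // Innermost level: the single int field inside the tuple.
            Schema tupleSchema =
                new Schema(new FieldSchema(null, DataType.INTEGER));
            // Middle level: the explicit tuple that the bag contains.
            Schema bagSchema =
                new Schema(new FieldSchema(null, tupleSchema, DataType.TUPLE));
            // Outer level: the bag returned by exec().
            return new Schema(new FieldSchema(null, bagSchema, DataType.BAG));
        } catch (FrontendException e) {
            // Assumption: failing loudly is acceptable for this sketch.
            throw new RuntimeException("Could not build output schema", e);
        }
    }
}
```

Building the schema level by level keeps the bag/tuple/field nesting visible, which is easy to lose track of in the single nested expression. Dropping the middle level gives the first form from the question.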