Pig >> mail # dev >> Re: Question on the LogicalPlanTester

RE: Question on the LogicalPlanTester
Bags have schemas. The schema of a bag is the schema of the tuples
inside the bag. Pi is correct when he says "...doesn't output Bag fields
with tuples wrapped inside but Bag with schema instead."
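
As a minimal sketch of what this means (the file names and field names
are hypothetical, chosen only for illustration): in the script below, A
and B each show up in C as a bag field, and the schema attached to each
bag describes the tuples it holds, e.g. (id, name) for A.

```
-- Hypothetical inputs, for illustration only.
A = LOAD 'a.txt' AS (id:int, name:chararray);
B = LOAD 'b.txt' AS (id:int, score:long);
-- C has one field for the group key plus one bag field per input.
C = COGROUP A BY id, B BY id;
DESCRIBE C;
```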


-----Original Message-----
From: Alan Gates [mailto:[EMAIL PROTECTED]]
Sent: Friday, June 20, 2008 9:21 AM
Subject: Re: Question on the LogicalPlanTester

Comments inlined.

pi song wrote:
> I came across a couple more issues:-
> 1) Currently we don't allow specifying only a data type with no alias
> in a schema declaration. I work around this by using the "null" keyword:
>     null:int, null:long    (null means no alias specified)
> This is obviously not the right solution. We again need a discussion
> on schema declaration for the different cases:-
>  - Specify both type and alias  (Currently supported)
>  - Specify only alias, no type  (Currently supported)
>  - Specify only type, no alias  (Currently not supported)
As specified, there isn't support for giving a field's type without
giving it an alias.  I don't know that we need to allow this.
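
For reference, the two currently supported forms might look like this
in a LOAD statement. This is only a sketch; the file name and field
names are hypothetical.

```
-- Hypothetical input file and field names, for illustration only.
-- Both type and alias given for each field:
A = LOAD 'data.txt' AS (id:int, total:long);
-- Alias only, no type declared:
B = LOAD 'data.txt' AS (id, total);
```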
> 2) The current cogroup implementation doesn't output Bag fields with
> tuples wrapped inside but Bag with schema instead. This is apparently
> inconsistent with the schema definition. I don't know which one is right.
> We've discussed this before but didn't reach a consensus.
> BTW, a quick way to work around this would be altering (hacking) schema
> loading in LogicalPlanLoader.createLOCogroup()
Santhosh, can you comment on this?  My understanding was that bags could
have schemas too, as that implied that they contained tuples with that
> <snip>