-
What is the canonicalname field in a Schema object used for?
Jonathan Coveney 2011-11-17, 03:25
If you do:
Schema s1 = Utils.getSchemaFromString("b:bag{t:tuple(name:chararray,age:int)}");
then it will all be -1'd out. It doesn't seem to be used anywhere; I was just wondering, since in other cases it is populated properly.
Thanks
Jon
-
Re: What is the canonicalname field in a Schema object used for?
Alan Gates 2011-11-18, 16:58
Santhosh is the best person to answer this, as he wrote that code. But, IIRC, its purpose is to store the "full" name of a column after cogroups and joins. For example,
A = load 'foo' as (u, v);
B = load 'bar' as (x, y);
C = join A by u, B by x;
I believe the canonicalname will now hold A::u, etc.
Alan.
-
RE: What is the canonicalname field in a Schema object used for?
Santhosh Srinivasan 2011-11-21, 07:17
Alan is right. It's meant to help with disambiguation when the column name is the same across relations. In Alan's example, if you had u instead of x in B, then the columns in C (the join) would be (A::u, v, B::u, y). A::v and B::y are also valid column names.
Santhosh
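To make the disambiguation above concrete, here is a minimal toy sketch in plain Java. It is not Pig's implementation and does not use the Pig API: the class AliasResolutionSketch and the methods qualify and resolve are hypothetical names invented for illustration. It assumes only what the thread states: a joined column has a "full" name of the form relation::field, and a short name is usable as long as it matches exactly one column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical names, not part of Apache Pig.
public class AliasResolutionSketch {

    /** Prefix every field with its relation alias, e.g. ("A", [u, v]) gives [A::u, A::v]. */
    public static List<String> qualify(String relation, List<String> fields) {
        List<String> out = new ArrayList<>();
        for (String field : fields) {
            out.add(relation + "::" + field);
        }
        return out;
    }

    /**
     * Resolve a column reference against the full names of a joined relation.
     * A reference matches a column if it equals the full name or the part
     * after the last "::". Exactly one match is required.
     */
    public static String resolve(List<String> columns, String name) {
        List<String> matches = new ArrayList<>();
        for (String column : columns) {
            int sep = column.lastIndexOf("::");
            String shortName = sep < 0 ? column : column.substring(sep + 2);
            if (column.equals(name) || shortName.equals(name)) {
                matches.add(column);
            }
        }
        if (matches.size() == 1) {
            return matches.get(0);
        }
        if (matches.isEmpty()) {
            throw new IllegalArgumentException("Unknown column: " + name);
        }
        throw new IllegalArgumentException(
                "Ambiguous column " + name + ", candidates: " + matches);
    }

    public static void main(String[] args) {
        // Models: A = load 'foo' as (u, v); B = load 'bar' as (u, y); C = join A by u, B by u;
        List<String> joined = new ArrayList<>(qualify("A", Arrays.asList("u", "v")));
        joined.addAll(qualify("B", Arrays.asList("u", "y")));

        System.out.println(resolve(joined, "v"));    // unique short name
        System.out.println(resolve(joined, "B::u")); // full name is always unambiguous
        try {
            resolve(joined, "u");                    // appears in both A and B
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model, v and y resolve on their own because each appears in only one relation, while u has to be written as A::u or B::u.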
-
Re: What is the canonicalname field in a Schema object used for?
Jonathan Coveney 2011-11-21, 07:41
Ah, ok, that makes sense. I only asked because in some UDFs it was being ignored, and that would explain why: they didn't need to be disambiguated in that particular case.