Pig >> mail # dev >> What is the canonicalname field in a Schema object used for?


Jonathan Coveney 2011-11-17, 03:25
Alan Gates 2011-11-18, 16:58
RE: What is the canonicalname field in a Schema object used for?
Alan is right. It's meant to help with disambiguation when the same column name appears in more than one relation. In Alan's example, if B had u instead of x, then the columns in C (the join) would be (A::u, v, B::u, y). A::v and B::y are also valid column names.
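
To make the disambiguation rule concrete, here is a minimal sketch of how a column reference might be resolved against the join output. It is a toy model of the behaviour described above, not Pig's actual code: the class JoinNameResolver and its resolve method are hypothetical names, and the join columns are written out in fully qualified form for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of name resolution in a join schema (hypothetical, not Pig's implementation). */
final class JoinNameResolver {

    /**
     * Returns the index of the column that `name` refers to.
     * A reference matches a column by its full name ("A::u") or by the
     * part after "::" ("u"). Throws if nothing matches or if more than
     * one column matches (the ambiguous case).
     */
    static int resolve(List<String> qualifiedColumns, String name) {
        List<Integer> matches = new ArrayList<>();
        for (int i = 0; i < qualifiedColumns.size(); i++) {
            String col = qualifiedColumns.get(i);
            int sep = col.lastIndexOf("::");
            String shortName = sep < 0 ? col : col.substring(sep + 2);
            if (col.equals(name) || shortName.equals(name)) {
                matches.add(i);
            }
        }
        if (matches.isEmpty()) {
            throw new IllegalArgumentException("No such column: " + name);
        }
        if (matches.size() > 1) {
            throw new IllegalArgumentException("Ambiguous column: " + name);
        }
        return matches.get(0);
    }

    public static void main(String[] args) {
        // Join of A(u, v) with B(u, y), as in the modified example.
        List<String> c = List.of("A::u", "A::v", "B::u", "B::y");
        System.out.println(resolve(c, "v"));    // short name is unique
        System.out.println(resolve(c, "A::v")); // full name also works
        System.out.println(resolve(c, "B::u")); // full name needed here
        try {
            resolve(c, "u");                    // matches A::u and B::u
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model v and A::v resolve to the same column, while a bare u is rejected because it could mean either A::u or B::u.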

Santhosh

-----Original Message-----
From: Alan Gates [mailto:[EMAIL PROTECTED]]
Sent: Friday, November 18, 2011 8:58 AM
To: [EMAIL PROTECTED]
Cc: Santhosh Srinivasan
Subject: Re: What is the canonicalname field in a Schema object used for?

Santhosh is the best person to answer this, as he wrote that code. But, IIRC, its purpose is to store the "full" name of a column after cogroups and joins. For example,

A = load 'foo' as (u, v);
B = load 'bar' as (x, y);
C = join A by u, B by x;

I believe the canonicalname will now hold A::u, etc.
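
As a rough illustration of what "full" names would look like for the script above, the following sketch prefixes each input column with its relation alias. It only models the naming convention, assuming the recollection above is right; JoinSchemaSketch and joinColumns are hypothetical names and are not part of Pig's Schema class.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of fully qualified column names after a join (hypothetical, not Pig's Schema class). */
final class JoinSchemaSketch {

    /** Concatenates both inputs' columns, prefixing each with "<alias>::". */
    static List<String> joinColumns(String leftAlias, List<String> leftCols,
                                    String rightAlias, List<String> rightCols) {
        List<String> out = new ArrayList<>();
        for (String col : leftCols) {
            out.add(leftAlias + "::" + col);
        }
        for (String col : rightCols) {
            out.add(rightAlias + "::" + col);
        }
        return out;
    }

    public static void main(String[] args) {
        // A = load 'foo' as (u, v); B = load 'bar' as (x, y); C = join A by u, B by x;
        List<String> c = joinColumns("A", List.of("u", "v"), "B", List.of("x", "y"));
        System.out.println(c); // prints [A::u, A::v, B::x, B::y]
    }
}
```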

Alan.

On Nov 16, 2011, at 7:25 PM, Jonathan Coveney wrote:

> If you do:
>
> Schema s1 = Utils.getSchemaFromString(
> "b:bag{t:tuple(name:chararray,age:int)}");
>
>
> then it will all be -1'd out. It doesn't seem to be used anywhere; I
> was just wondering, since in other cases it will be populated properly.
>
>
> Thanks
>
> Jon
Jonathan Coveney 2011-11-21, 07:41