Pig, mail # user - Easy question...difference between this::form and this.form?


RE: Easy question...difference between this::form and this.form?
Santhosh Srinivasan 2010-12-07, 18:11
> The sql way to deal with this issue is essentially to keep the name of the parent relation
> around during parsing, and require that you explicitly provide the desired parent if column
> names are ambiguous. That's probably something that could be implemented now that we have  
> the required metadata in the operators (I believe it wasn't there when the disambiguation
> design was implemented).

Isn't that true today? Unambiguous columns can be referenced without the :: operator.

Santhosh

-----Original Message-----
From: Dmitriy Ryaboy [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, December 07, 2010 9:49 AM
To: [EMAIL PROTECTED]
Subject: Re: Easy question...difference between this::form and this.form?

Consider self-joins with regard to the meaningful-name problem...
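
A minimal sketch of the self-join case, assuming a hypothetical input file 'foo' with two integer columns. It also assumes the usual Pig approach of loading the same data under two aliases (A1 and A2 are illustrative names). Both sides have the same schema, so descriptive column names cannot prevent a clash and every reference needs a prefix:

```pig
-- Hypothetical input; the same file is loaded under two aliases.
A1 = LOAD 'foo' AS (x:int, y:int);
A2 = LOAD 'foo' AS (x:int, y:int);

C = JOIN A1 BY x, A2 BY x;

-- Both x and y now occur twice in C, so a bare "y" would be ambiguous.
D = FILTER C BY A1::y > A2::y;
```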

The sql way to deal with this issue is essentially to keep the name of the parent relation around during parsing, and require that you explicitly provide the desired parent if column names are ambiguous. That's probably something that could be implemented now that we have the required metadata in the operators (I believe it wasn't there when the disambiguation design was implemented).

As for the difference between "::" and ".": the double-colon has no special meaning as an operator; it is simply part of the field name. The period is essentially a projection operator: you are saying, "the thing to the left of the period is a tuple, and the thing to the right is a field in that tuple". (This works for bags as well, in which case the thing to the left of the period is a bag of tuples, and the thing to the right is a field in every tuple in the bag.)
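
A minimal sketch of the two notations side by side, assuming hypothetical inputs 'foo' and 'bar' with the schemas shown (the file names, column names, and types are illustrative, not from the thread):

```pig
A = LOAD 'foo' AS (x:int, y:int);
B = LOAD 'bar' AS (x:int, z:int);

-- After a join, each field carries its parent alias as a prefix.
C = JOIN A BY x, B BY x;

-- "::" is part of the name: A::x is one flat column of C.
-- y and z are unambiguous here, so the prefix can be omitted for them.
D = FOREACH C GENERATE A::x, y, z;

-- "." projects a field out of a tuple or a bag of tuples.
-- G has two fields: group, and a bag named A holding the grouped tuples.
G = GROUP A BY x;
H = FOREACH G GENERATE group, A.y;  -- A.y: a bag holding only y from each tuple
```

Running DESCRIBE on C and G is a quick way to see which names are flat fields containing "::" and which are nested bags that need ".".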

-Dmitriy.

2010/12/7 Anze <[EMAIL PROTECTED]>

>
> If one uses meaningful names then Pig would never use '::' anyway. The
> problem is when you use multiple joins in sequence, then '::' names
> get very annoying.
> But that's just my opinion. :)
>
> Anze
>
>
> On Tuesday 07 December 2010, Jonathan Coveney wrote:
> > Would that even be much better? It seems like it'd be better to have it be
> > consistent in appending the whatever::, so that at least you have to be
> > cognizant of it when you do the join. If it starts being too clever, then
> > it's up to you to figure out when it does and doesn't do it, which might be
> > annoying.
> >
> > 2010/12/7 Anze <[EMAIL PROTECTED]>
> >
> > > I understand the reason for this, it just seems like a drastic solution.
> > > :)
> > >
> > > Ideally, Pig should be clever enough to detect ambiguity and deal
> > > with it, and leave the non-conflicting names intact. For instance:
> > >
> > > A = load 'foo' as (x, y, z);
> > > B = load 'bar' as (x, a, b, c);
> > > C = join A by x, B by x;
> > > DESCRIBE C;
> > > C: {A::x, y, z, B::x, a, b, c}
> > >
> > > or even:
> > > C: {x, y, z, B::x, a, b, c}
> > >
> > > or even a step further, in case of JOIN:
> > > C: {x, y, z, a, b, c}
> > > (since join *joins* by x, why would there be two? This doesn't
> > > always work for other operations, of course)
> > >
> > > Reasoning: at least in my cases the names are descriptive from the start,
> > > therefore there are almost no name conflicts. In rare cases where there
> > > are, Pig can determine that and use old syntax with "::", then let me
> > > deal with it.
> > >
> > > I know this is a backwards-incompatible change and is not likely to
> > > be accepted, but still... :)
> > >
> > > Anze
> > >
> > > On Monday 06 December 2010, Alan Gates wrote:
> > > > The reason it's needed is that ambiguities would result otherwise.
> > > >
> > > > A = load 'foo' as (x, y, z);
> > > > B = load 'bar' as (w, x, y, z);
> > > > C = join A by x, B by x;
> > > > D = filter C by z > 0;  -- which z?
> > > >
> > > > As long as the name is not ambiguous, the :: is not required.  
> > > > So in the above example it would be perfectly legal to say
> > > >
> > > > D = filter C by w > 0;
> > > >
> > > > Out of curiosity, why do you want to remove the :: names?
> > > >
> > > > Alan.
> > > >