Pig >> mail # dev >> Scalar problem


Aniket Mokashi 2012-04-08, 07:33
Jonathan Coveney 2012-04-08, 14:42
Alan Gates 2012-04-09, 17:15
Dmitriy Ryaboy 2012-04-09, 17:21
Alan Gates 2012-04-09, 17:28

Re: Scalar problem
We add a cast internally to support the implicit casting right now.

We have the following JIRAs for this:

https://issues.apache.org/jira/browse/PIG-1967
https://issues.apache.org/jira/browse/PIG-2205
Any more?

Thanks,
Aniket

On Mon, Apr 9, 2012 at 10:28 AM, Alan Gates <[EMAIL PROTECTED]> wrote:

> I was referring to 1 below.  I think making this mandatory instead of
> allowed is sufficient (obviously over time, so we don't break
> compatibility).
>
> Alan.
>
> On Apr 9, 2012, at 10:21 AM, Dmitriy Ryaboy wrote:
>
> > Alan, which idea are you +1 on? I think (int) D is the current syntax.
> >
> > There are a couple problems that people hit in the current scalar
> > implementation, both of which I think can be fixed without introducing
> > new syntax:
> >
> > 1) Require the cast, don't do it implicitly. This was actually in the
> > design doc but didn't get implemented for some reason.
> >
> > 2) Throw an error on the frontend if the scalar relation is the
> > relation being iterated on. Meaning:
> >
> > foreach foo generate (int) foo.id; -- this will cause the second "foo"
> > to be interpreted as a scalar invocation, although clearly it's just a
> > bug, and the programmer meant to say "generate (int) id"
> >
> > We can just detect this error case and throw during compilation.
> >
> > 3) Improve MR-side logging to make it clear that a relation is being
> > loaded from the side, what the relation is, etc.
> >
> > I believe we have JIRAs open for all of these.
> >
> > D
> >
> > On Mon, Apr 9, 2012 at 10:15 AM, Alan Gates <[EMAIL PROTECTED]> wrote:
> >> I'm +1 on this idea, since it's been a problem since the beginning.
> >> Why not use regular casting notation though, rather than develop another
> >> notation?  That's what we discussed originally when we were deciding
> >> whether to require casting or do it silently.  So instead of D->a or
> >> SCALAR(D) it would be (int)D.
> >>
> >> Alan.
> >>
> >> On Apr 8, 2012, at 7:42 AM, Jonathan Coveney wrote:
> >>
> >>> I like this idea, and I think we should deprecate the old syntax, and we
> >>> can discuss later when it'd get deleted (and when that would be worth it...
> >>> if we have a new syntax, it seems pretty painless to have the other one
> >>> float around for backwards compatibility, and if anyone uses it it's a sort
> >>> of "caveat emptor").
> >>>
> >>> 2012/4/8 Aniket Mokashi <[EMAIL PROTECTED]>
> >>>
> >>>> Hi,
> >>>>
> >>>> I have noticed early users of Pig often hit issues because of confusing
> >>>> syntax between scalars and projections. I think scalar syntax should be
> >>>> made more explicit for users to use in order to avoid these problems. For
> >>>> example- D = foreach C generate B->count; etc.
> >>>> I am sure we might break some backward compatibility but we can at least
> >>>> deprecate the syntax for a few versions and eventually move to new syntax.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> Thanks,
> >>>> Aniket
> >>>>
> >>
>
>
--
"...:::Aniket:::... Quetzalco@tl"