Pig >> mail # dev >> Scalar problem


Aniket Mokashi 2012-04-08, 07:33
Jonathan Coveney 2012-04-08, 14:42
Alan Gates 2012-04-09, 17:15
Dmitriy Ryaboy 2012-04-09, 17:21
Alan Gates 2012-04-09, 17:28

Re: Scalar problem
Right now we add a cast internally to support the implicit casting.

We have the following Jiras for this:

https://issues.apache.org/jira/browse/PIG-1967
https://issues.apache.org/jira/browse/PIG-2205
Any more?

Thanks,
Aniket

On Mon, Apr 9, 2012 at 10:28 AM, Alan Gates <[EMAIL PROTECTED]> wrote:

> I was referring to 1 below.  I think making this mandatory instead of
> allowed is sufficient (obviously over time, so we don't break
> compatibility).
>
> Alan.
>
> On Apr 9, 2012, at 10:21 AM, Dmitriy Ryaboy wrote:
>
> > Alan, which idea are you +1 on? I think (int) D is the current syntax.
> >
> > There are a couple problems that people hit in the current scalar
> > implementation, both of which I think can be fixed without introducing
> > new syntax:
> >
> > 1) Require the cast, don't do it implicitly. This was actually in the
> > design doc but didn't get implemented for some reason.
> >
> > 2) Throw an error on the frontend if the scalar relation is the
> > relation being iterated on. Meaning:
> >
> > foreach foo generate (int) foo.id; -- this will cause the second "foo"
> > to be interpreted as a scalar invocation, although clearly it's just a
> > bug, and the programmer meant to say "generate (int) id"
> >
> > We can just detect this error case and throw during compilation.
> >
> > 3) Improve MR-side logging to make it clear that a relation is being
> > loaded from the side, what the relation is, etc.
> >
> > I believe we have Jiras open for all of these.
> >
> > D
> >
> > On Mon, Apr 9, 2012 at 10:15 AM, Alan Gates <[EMAIL PROTECTED]> wrote:
> >> I'm +1 on this idea, since it's been a problem since the beginning.
> >> Why not use regular casting notation though, rather than develop another
> >> notation?  That's what we discussed originally when we were deciding
> >> whether to require casting or do it silently.  So instead of D->a or
> >> SCALAR(D) it would be (int)D.
> >>
> >> Alan.
> >>
> >> On Apr 8, 2012, at 7:42 AM, Jonathan Coveney wrote:
> >>
> >>> I like this idea, and I think we should deprecate the old syntax, and we
> >>> can discuss later when it'd get deleted (and when that would be worth it...
> >>> if we have a new syntax, it seems pretty painless to have the other one
> >>> float around for backwards compatibility, and if anyone uses it it's a sort
> >>> of "caveat emptor").
> >>>
> >>> 2012/4/8 Aniket Mokashi <[EMAIL PROTECTED]>
> >>>
> >>>> Hi,
> >>>>
> >>>> I have noticed early users of pig often hit issues because of confusing
> >>>> syntax between scalars and projections. I think scalar syntax should be
> >>>> made more explicit for users to use in order to avoid these problems. For
> >>>> example- D = foreach C generate B->count; etc.
> >>>> I am sure we might break some backward compatibility but we can at least
> >>>> deprecate the syntax for a few versions and eventually move to new syntax.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> Thanks,
> >>>> Aniket
> >>>>
> >>
>
>
--
"...:::Aniket:::... Quetzalco@tl"
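
Editorial sketch of point 2 in Dmitriy's list above (rejecting a scalar reference to the relation that is being iterated on). This is not Pig's frontend code: the function name find_self_scalar_refs, the regular expression, and the idea of passing the generate expressions as plain strings are all assumptions made for illustration. A real check would presumably work on the parsed logical plan rather than on text.

```python
import re

# Hypothetical helper, not part of Pig. It looks for "<alias>.<field>"
# references inside the expressions of a "foreach <alias> generate ..." statement.
_QUALIFIED_REF = re.compile(
    r"\b([A-Za-z_][A-Za-z0-9_]*)\.([A-Za-z_][A-Za-z0-9_]*)"
)


def find_self_scalar_refs(iterated_alias, generate_exprs):
    """Return (expression, field) pairs that use the iterated relation as a scalar."""
    problems = []
    for expr in generate_exprs:
        for alias, field in _QUALIFIED_REF.findall(expr):
            # A qualified reference to the relation being iterated is almost
            # certainly a mistaken projection, so report it.
            if alias == iterated_alias:
                problems.append((expr, field))
    return problems


# The case from the thread: foreach foo generate (int) foo.id;
bad = find_self_scalar_refs("foo", ["(int) foo.id"])

# A plain projection and a scalar taken from another relation are left alone.
# The alias C and the field cnt are made-up names.
ok = find_self_scalar_refs("foo", ["id", "(long) C.cnt"])
```

Under these assumptions, a compiler could raise an error for every entry in the returned list and suggest the unqualified field name instead, which matches the "generate (int) id" correction in the quoted message.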