Pig >> mail # user >> STRSPLIT problems (or UDF shortcoming?)


Re: STRSPLIT problems (or UDF shortcoming?)
Right - this was my point.  Dropping the 'as' clause forces you to use
positional specifiers, which don't seem to have the same issue.  Seems like
this would warrant a JIRA, if only to document the distinction a bit better.

Norbert

On Fri, May 18, 2012 at 1:13 PM, Nerius Landys <[EMAIL PROTECTED]> wrote:

> > From what I can tell, this does seem like a bug.  Switching to positional
> > specifiers seems to work around the issue:
> >
> > TEST = FOREACH MOVEMENT GENERATE $3;
> > POSA = FOREACH TEST GENERATE STRSPLIT($0, '/');
> >
> > Possibly some casting is being applied in one case (positional
> > specifiers) but not the other?
>
> Wow I just made a very interesting finding after trying your advice.
> The two sessions below are identical except for lines 2 and 7.  Line 7
> has the "AS startpos:chararray", whereas line 2 has no "AS".
>
> 0. grunt> A = LOAD 'bin-proto-4';
> 1. grunt> MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';
> 2. grunt> TEST = FOREACH MOVEMENT GENERATE $3;
> 3. grunt> POSA = FOREACH TEST GENERATE STRSPLIT($0, '/');
> 4. grunt> DUMP POSA;
> ((1,1))
> ((10,1))
>
> 5. grunt> A = LOAD 'bin-proto-4';
> 6. grunt> MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';
> 7. grunt> TEST = FOREACH MOVEMENT GENERATE $3 AS startpos:chararray;
> 8. grunt> POSA = FOREACH TEST GENERATE STRSPLIT($0, '/');
> 9. grunt> DUMP POSA;
> ()
> ()
>
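One way to narrow down the difference between the two sessions above is to compare the schemas Pig infers for each variant, and to try an explicit cast instead of a type declared in the AS clause. The following is only a sketch under assumptions: the input path and column positions come from the thread, the aliases TEST_POS, TEST_AS and TEST_CAST are made up for illustration, and whether the explicit cast avoids the empty tuples has not been verified here.

```pig
-- Input path and column positions are taken from the thread.
A = LOAD 'bin-proto-4';
MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';

-- Variant 1: positional reference, no AS clause (reported to work).
TEST_POS = FOREACH MOVEMENT GENERATE $3;
DESCRIBE TEST_POS;

-- Variant 2: AS clause with a declared type (reported to yield empty tuples).
TEST_AS = FOREACH MOVEMENT GENERATE $3 AS startpos:chararray;
DESCRIBE TEST_AS;

-- Variant 3 (assumption, untested): explicit cast, with AS supplying only a name.
TEST_CAST = FOREACH MOVEMENT GENERATE (chararray) $3 AS startpos;
POSA = FOREACH TEST_CAST GENERATE STRSPLIT(startpos, '/');
DUMP POSA;
```

If the DESCRIBE output differs between variants 1 and 2, that difference would be worth including in a JIRA report along with the two DUMP results.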