
Pig >> mail # user >> STRSPLIT problems (or UDF shortcoming?)


Re: STRSPLIT problems (or UDF shortcoming?)
Right - this was my point.  Dropping the 'as' clause forces you to use
positional specifiers, which don't seem to have the same issue.  Seems like
this would warrant a JIRA, if only to document the distinction a bit better.

Norbert

On Fri, May 18, 2012 at 1:13 PM, Nerius Landys <[EMAIL PROTECTED]> wrote:

> > From what I can tell, this does seem like a bug.  Switching to positional
> > specifiers seems to work around the issue:
> >
> > TEST = FOREACH MOVEMENT GENERATE $3;
> > POSA = FOREACH TEST GENERATE STRSPLIT($0, '/');
> >
> > Possibly some casting is being applied in one case (positional specifiers)
> > but not the other?
>
> Wow, I just made a very interesting finding after trying your advice.
> The two sessions below are identical except for lines 2 and 7.  Line 7
> has the "AS startpos:chararray", whereas line 2 has no "AS".
>
> 0. grunt> A = LOAD 'bin-proto-4';
> 1. grunt> MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';
> 2. grunt> TEST = FOREACH MOVEMENT GENERATE $3;
> 3. grunt> POSA = FOREACH TEST GENERATE STRSPLIT($0, '/');
> 4. grunt> DUMP POSA;
> ((1,1))
> ((10,1))
>
> 5. grunt> A = LOAD 'bin-proto-4';
> 6. grunt> MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';
> 7. grunt> TEST = FOREACH MOVEMENT GENERATE $3 AS startpos:chararray;
> 8. grunt> POSA = FOREACH TEST GENERATE STRSPLIT($0, '/');
> 9. grunt> DUMP POSA;
> ()
> ()
>
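
One variation that may be worth trying, based on the casting guess above: cast the field explicitly and name it without a type, instead of relying on "AS startpos:chararray". This is an untested sketch that reuses the relation names from the sessions above. It assumes that a typed AS clause may only declare the field's type without converting the underlying value, which is a hypothesis and not confirmed Pig behavior.

```pig
A = LOAD 'bin-proto-4';
MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';

-- Assumption: the explicit (chararray) cast converts the value itself,
-- whereas "AS startpos:chararray" alone may only relabel the schema.
TEST = FOREACH MOVEMENT GENERATE (chararray) $3 AS startpos;

-- Print the schema Pig has recorded for TEST before calling the UDF.
DESCRIBE TEST;

POSA = FOREACH TEST GENERATE STRSPLIT(startpos, '/');
DUMP POSA;
```

If this version returns the split tuples while the typed AS version returns empty ones, that would support the casting explanation and give a concrete reproduction for a JIRA.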