Pig >> mail # user >> STRSPLIT problems (or UDF shortcoming?)


Re: STRSPLIT problems (or UDF shortcoming?)
From what I can tell, this does seem like a bug.  Switching to positional
specifiers seems to work around the issue:

TEST = FOREACH MOVEMENT GENERATE $3;
POSA = FOREACH TEST GENERATE STRSPLIT($0, '/');

Possibly some casting is being applied in one case (positional specifiers)
but not the other?
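
If a missing cast is indeed the cause, an explicit cast might also work
while keeping the named field. A minimal, untested sketch, assuming the same
file and aliases as in the quoted session below; the only change is moving
the type from the AS clause into a cast expression:

```pig
-- Sketch only: not verified against pig-0.10.0.
A = LOAD 'bin-proto-4';
MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';
-- Cast the bytearray explicitly rather than relying on AS startpos:chararray.
TEST = FOREACH MOVEMENT GENERATE (chararray) $3 AS startpos;
POSA = FOREACH TEST GENERATE STRSPLIT(startpos, '/');
DUMP POSA;
```

If this produces the split tuples where the AS form produced empty ones, that
would support the casting explanation.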

Norbert

On Thu, May 17, 2012 at 7:39 PM, Nerius Landys <[EMAIL PROTECTED]> wrote:

> > We ended up using 0.10 on EMR and it's been working fine so far...
>
> OK a bit of bad news.  0.10 did not fix my problem.
> I'll recap the entire situation.
> HADOOP_HOME is set to hadoop-0.20.205.0, Pig version is now pig-0.10.0.
>
> File 'bin-proto-4' is:
>
> Meta    1234567890      foo     34
> Movement        1234567890      Rambetter       1/1     2/3
> Movement        1234567890      Freddyman       10/1    10/2
>
> (with tab delimiters)
>
> grunt> A = LOAD 'bin-proto-4';
> grunt> MOVEMENT = FILTER A BY (chararray) $0 == 'Movement';
> grunt> TEST = FOREACH MOVEMENT GENERATE $3 AS startpos:chararray;
> grunt> POSA = FOREACH TEST GENERATE STRSPLIT(startpos,'/');
> grunt> DUMP POSA;
> ()
> ()
>
> grunt> DUMP TEST;
> (1/1)
> (10/1)
>
> Ran this on my local machine just now.
>