Pig >> mail # user >> schema of pig flatten


Huo Zhu 2012-09-04, 11:16
Re: schema of pig flatten
You must cast explicitly:

b = foreach a generate (int)foo as foo:int;

Russell Jurney
twitter.com/rjurney
[EMAIL PROTECTED]
datasyndrome.com

On Sep 4, 2012, at 4:17 AM, Huo Zhu <[EMAIL PROTECTED]> wrote:

> I recently ran into this problem at work; it concerns Pig's flatten. Here is a
> simple example that shows it.
>
> Two files:
>
> ===file1===
> 1_a
> 2_b
> 4_d
>
> ===file2 (tab separated)===
> 1 a
> 2 b
> 3 c
>
> I tried three scripts in Pig 0.9 and Pig 0.10, and two of them throw exceptions.
>
> pig script 1:
>
> a = load 'file1' as (str:chararray);
> b = load 'file2' as (num:int, ch:chararray);
> a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int, ch:chararray);
> c = join a1 by num, b by num;
> dump c;   -- exception java.lang.String cannot be cast to java.lang.Integer
>
> pig script 2:
>
> a = load 'file1' as (str:chararray);
> b = load 'file2' as (num:int, ch:chararray);
> a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int, ch:chararray);
> a2 = foreach a1 generate (int)num as num, ch as ch;
> c = join a2 by num, b by num;
> dump c;   -- exception java.lang.String cannot be cast to java.lang.Integer
>
> pig script 3:
>
> a = load 'file1' as (str:chararray);
> b = load 'file2' as (num:int, ch:chararray);
> a1 = foreach a generate flatten(STRSPLIT(str,'_',2));
> a2 = foreach a1 generate (int)$0 as num, $1 as ch;
> c = join a2 by num, b by num;
> dump c;   -- works, result is correct
>
> Could somebody explain why scripts 1 and 2 fail but script 3 succeeds?
> Thanks!
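
Applying the explicit-cast advice from the reply to the scripts in the question can be sketched as follows. This is a minimal sketch modeled on script 3, which the asker reports as working. It assumes the same two input files and reuses the asker's alias names; the `describe` statement is added here only for illustration. One plausible reading, not confirmed in this message, is that a typed `as (num:int, ...)` clause on `flatten` only labels the fields without converting the strings that STRSPLIT returns, so a later `(int)num` has nothing left to convert.

```pig
-- Sketch based on script 3 from the question; file names and aliases are the asker's.
a  = load 'file1' as (str:chararray);
b  = load 'file2' as (num:int, ch:chararray);

-- Flatten without a typed "as" clause, so no type is declared for the split fields.
a1 = foreach a generate flatten(STRSPLIT(str, '_', 2));

-- Cast explicitly in a separate foreach, using the "(int)foo as foo:int" form from the reply.
a2 = foreach a1 generate (int)$0 as num:int, $1 as ch;

-- describe (added for illustration) prints the schema Pig has assigned to a2.
describe a2;

c = join a2 by num, b by num;
dump c;
```

Running `describe` on `a1` in scripts 1 and 2 as well may help show where the declared type and the actual values diverge.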
Huo Zhu 2012-09-05, 02:08
Gianmarco De Francisci Mo... 2012-09-05, 08:22
Huo Zhu 2012-09-06, 08:14