Huo Zhu 2012-09-04, 11:16
Re: schema of pig flatten
Russell Jurney 2012-09-04, 14:04
You must cast explicitly:
b = foreach a generate (int)foo as foo:int;

Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com

On Sep 4, 2012, at 4:17 AM, Huo Zhu <[EMAIL PROTECTED]> wrote:

> I recently ran into this problem in my work; it's about Pig flatten. I'll use a
> simple example to illustrate it.
>
> two files
> ===file1==
> 1_a
> 2_b
> 4_d
>
> ===file2 (tab separated)==
> 1	a
> 2	b
> 3	c
>
> I tried three scripts in Pig 0.9 and Pig 0.10, and got some exceptions.
>
> pig script 1:
>
> a = load 'file1' as (str:chararray);
> b = load 'file2' as (num:int, ch:chararray);
> a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int, ch:chararray);
> c = join a1 by num, b by num;
> dump c; -- exception java.lang.String cannot be cast to java.lang.Integer
>
> pig script 2:
>
> a = load 'file1' as (str:chararray);
> b = load 'file2' as (num:int, ch:chararray);
> a1 = foreach a generate flatten(STRSPLIT(str,'_',2)) as (num:int, ch:chararray);
> a2 = foreach a1 generate (int)num as num, ch as ch;
> c = join a2 by num, b by num;
> dump c; -- exception java.lang.String cannot be cast to java.lang.Integer
>
> pig script 3:
>
> a = load 'file1' as (str:chararray);
> b = load 'file2' as (num:int, ch:chararray);
> a1 = foreach a generate flatten(STRSPLIT(str,'_',2));
> a2 = foreach a1 generate (int)$0 as num, $1 as ch;
> c = join a2 by num, b by num;
> dump c; -- right
>
> Could somebody explain why script 1 and script 2 fail but script 3 succeeds?
> Thanks!
Huo Zhu 2012-09-05, 02:08
Gianmarco De Francisci Mo... 2012-09-05, 08:22
Huo Zhu 2012-09-06, 08:14