Pig >> mail # user >> date treatment & date level aggregations
RE: date treatment & date level aggregations
Hi Avram,

A few things to note:

1. The builtin functions in Pig are Java UDFs, which makes their names
case sensitive. You should use TOKENIZE instead of tokenize.
2. It looks like the builtin TOKENIZE has to be fixed to support your
current usage. I have filed a bug report to track this: PIG-683
(https://issues.apache.org/jira/browse/PIG-683)

When PIG-683 is fixed, you should then be able to do the following:
A = load 'atest.csv' using PigStorage(',') as (v1,v2,v3,v4);
B = foreach A generate flatten(TOKENIZE(v2)) as (date,time), v3;
C = foreach B generate date, v3;
D = group C by date;
E = foreach D generate group, SUM(C.v3);
dump E;
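
A note on the "Invalid alias: A" error quoted below, offered as a minimal sketch rather than a definitive explanation: inside "foreach A", the fields of A are referenced directly by name or by position, without the "A." prefix. The relation names B1 and B2 here are hypothetical and only serve the illustration.

```pig
A = load 'atest.csv' using PigStorage(',') as (v1,v2,v3,v4);

-- Inside "foreach A", refer to a field of A by its name alone ...
B1 = foreach A generate v2, v3;

-- ... or by position ($0 is the first field, so $1 is v2 and $2 is v3).
B2 = foreach A generate $1, $2;
```

The prefixed form does appear in SUM(C.v3) above, but for a different reason: after "group C by date", each record of D holds a bag named C, and C.v3 projects the v3 field out of that bag.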

Thanks,
Santhosh

-----Original Message-----
From: Avram Aelony [mailto:[EMAIL PROTECTED]]
Sent: Thursday, February 19, 2009 10:59 AM
To: [EMAIL PROTECTED]
Subject: RE: date treatment & date level aggregations

I tried the capitalized version, that still leads to an error. Now it
appears to be a problem with the alias.

grunt> B = foreach A generate TOKENIZE(A.v2) as (date,time), v3;
2009-02-19 10:56:05,075 [main] ERROR org.apache.pig.tools.grunt.GruntParser - java.io.IOException: Invalid alias: A in A: (v1, v2, v3, v4 )
        at org.apache.pig.PigServer.registerQuery(PigServer.java:278)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:475)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:233)
        at org.apache.pig.tools.grunt.GruntParser.parseContOnError(GruntParser.java:91)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:54)
        at org.apache.pig.Main.main(Main.java:270)
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid alias: A in A: (v1, v2, v3, v4 )
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:3301)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:3225)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:2236)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:2175)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:2106)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:2038)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:2006)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.EvalArgsItem(QueryParser.java:2456)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.EvalArgs(QueryParser.java:2397)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.FuncEvalSpec(QueryParser.java:2356)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:2230)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:2175)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:2106)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:2038)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:2006)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:1955)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:1894)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:1862)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:1604)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:1569)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:711)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:512)
        at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:362)
        at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:47)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:275)
        ... 5 more

From: Olga Natkovich [mailto:[EMAIL PROTECTED]]
Sent: Thursday, February 19, 2009 10:54 AM
To: [EMAIL PROTECTED]
Subject: RE: date treatment & date level aggregations

Functions in Pig are case sensitive. The function name is TOKENIZE.
Please refer to the Pig Latin Manual for details:
http://wiki.apache.org/pig-data/attachments/FrontPage/attachments/plrm.htm

Olga