Pig >> mail # user >> Fw: Help with Script


ingvay7 2012-11-13, 17:57
Prashant Kommireddi 2012-11-13, 18:40
ingvay7 2012-11-13, 18:49
pablomar 2012-11-13, 18:36
Help with Script
hey all,

Very new Pig user here. I think I'm trying to get something very simple done, but I'm getting a few errors. See my script below. Any guidance will be appreciated. Thanks.

I get errors such as: Error during parsing. Invalid alias: serverin {time: double,count: double}
I am basically trying to duplicate the following SQL query:

select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt)
from TableA
group by 1, 2, 3;

My script is as follows:

a = LOAD 'Report' AS (
dt:chararray,
Server:chararray,
Type:chararray,
Ops:chararray,
UserID:chararray,
U_cnt:int,
U_tm:int,
U_min_tm:int,
U_max_tm:int,
U_avg_tm:float,
);
--Remove Test Servers
remtest = filter a by not Server matches 'Test%';
-- Filter to required columns
reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby
G2 = group reqd by Server,Type,Ops;
--Sum the User Counts and Times
G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;
--byServeroperation = order G3 by Server;
store G3 into 'Servertest';

ingvay7
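
For reference, one way the script above might be rewritten to match the SQL query. This is a sketch, not a tested answer: it assumes 'Report' is tab-delimited (the default for LOAD without a USING clause) and that the ten fields appear in the order declared. The output names users, total_tm and total_cnt are made up for illustration; they replace time and count to avoid any clash with built-in names.

```pig
-- Assumption: 'Report' is tab-delimited, which is what LOAD expects by default.
a = LOAD 'Report' AS (
    dt:chararray,
    Server:chararray,
    Type:chararray,
    Ops:chararray,
    UserID:chararray,
    U_cnt:int,
    U_tm:int,
    U_min_tm:int,
    U_max_tm:int,
    U_avg_tm:float   -- no trailing comma before the closing parenthesis
);

-- matches takes a Java regular expression, so 'Test.*' rather than SQL's 'Test%'
remtest = FILTER a BY NOT (Server MATCHES 'Test.*');

-- Keep only the columns the query needs, referenced by name
reqd = FOREACH remtest GENERATE Server, Type, Ops, U_cnt, U_tm;

-- A multi-column group key is written as a tuple, in parentheses
G2 = GROUP reqd BY (Server, Type, Ops);

-- After GROUP, each row holds 'group' (the key) and a bag named after the
-- input relation (reqd), so the aggregated fields are addressed as reqd.<field>
G3 = FOREACH G2 GENERATE
    FLATTEN(group)   AS (Server, Type, Ops),
    COUNT_STAR(reqd) AS users,
    SUM(reqd.U_tm)   AS total_tm,
    SUM(reqd.U_cnt)  AS total_cnt;

STORE G3 INTO 'Servertest';
```

FLATTEN(group) turns the key tuple back into three plain columns, so a later statement such as ORDER G3 BY Server can refer to Server directly.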
Prashant Kommireddi 2012-11-13, 16:59
Vishwanath 2012-11-13, 17:25
pablomar 2012-11-13, 16:59