FWD: Help with Script
hey all,

Very new Pig user here. I think I'm trying to get something very simple done, but I'm getting a few errors. See my script below. Any guidance will be appreciated. Thanks.

I get errors such as: Error during parsing. Invalid alias: serverin {time: double,count: double}
I am basically trying to duplicate the following SQL query:

select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt)
from TableA
group by 1, 2, 3;

My script is as follows:

a = LOAD 'Report' AS (
dt:chararray,
Server:chararray,
Type:chararray,
Ops:chararray,
UserID:chararray,
U_cnt:int,
U_tm:int,
U_min_tm:int,
U_max_tm:int,
U_avg_tm:float,
);
--Remove Test Servers
remtest = filter a by not Server matches 'Test%';
-- Filter to required columns
reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby
G2 = group reqd by Server,Type,Ops;
--Sum the User Counts and Times
G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;
--byServeroperation = order G3 by Server;
store G3 into 'Servertest';

ingvay7
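
For reference, here is a rough sketch of how the SQL query above might be written in Pig Latin. It is untested against the actual 'Report' data and rests on a few assumptions: the filter is meant to drop servers whose names start with "Test" (matches takes a Java regular expression, so the SQL-style 'Test%' becomes 'Test.*'), the input is tab-delimited as the default loader expects, and the output names users, total_tm, total_cnt and by_server are made up for illustration.

```pig
-- Load the report; note there is no comma after the last field.
a = LOAD 'Report' AS (
    dt:chararray,
    Server:chararray,
    Type:chararray,
    Ops:chararray,
    UserID:chararray,
    U_cnt:int,
    U_tm:int,
    U_min_tm:int,
    U_max_tm:int,
    U_avg_tm:float
);

-- Remove test servers (assumption: names beginning with "Test").
remtest = FILTER a BY NOT (Server MATCHES 'Test.*');

-- Keep only the required columns, by name rather than position.
reqd = FOREACH remtest GENERATE Server, Type, Ops, UserID, U_cnt, U_tm;

-- Grouping on several keys needs the keys wrapped in parentheses.
G2 = GROUP reqd BY (Server, Type, Ops);

-- Inside the FOREACH, aggregate over the bag named after the grouped
-- relation (reqd.U_tm), and flatten the group tuple so Server, Type
-- and Ops become ordinary columns again.
G3 = FOREACH G2 GENERATE
    FLATTEN(group) AS (Server, Type, Ops),
    COUNT(reqd) AS users,
    SUM(reqd.U_tm) AS total_tm,
    SUM(reqd.U_cnt) AS total_cnt;

-- Server is a valid alias here only because of the FLATTEN ... AS above.
by_server = ORDER G3 BY Server;

STORE by_server INTO 'Servertest';
```

COUNT skips tuples whose first field is null; if an exact count(*) is needed, COUNT_STAR(reqd) may be the closer match.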