|
|
ingvay7 2012-11-13, 17:57
(Apologies for resending but corrected script below) This is the error I got:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast.
Updated code:
a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float ); --Remove Test Servers remtest = filter a by not Server matches 'Test%';
-- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby G2 = group reqd by (Server,Type,Ops);
--Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;
store G3 into 'Servertest';
----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 11:59 AM Subject: Re: Help with Script
Hi,
Can you paste the error message here?
Sent from my iPhone
On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> hey all, > > Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. > > I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} > I am basically trying to duplicate the following SQL query: > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > from TableA > group by 1, 2, 3; > > > > My script is as follows: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float, > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > --Groupby > G2 = group reqd by Server,Type,Ops; > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > --byServeroperation = order G3 by Server; > store G3 into 'Servertest'; > > ingvay7
+
ingvay7 2012-11-13, 17:57
Prashant Kommireddi 2012-11-13, 18:40
SUM function requires that you specify the specific element from the grouping. In this case, U_tm and U_cnt are both within group/bags and need to be accessed as "reqd.U_tm" and "reqd.U_cnt". --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(reqd.U_tm)as time,SUM(reqd.U_cnt)as count;
On Nov 13, 2012, at 9:58 AM, ingvay7 <[EMAIL PROTECTED]> wrote:
(Apologies for resending but corrected script below) This is the error I got:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast.
Updated code:
a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float ); --Remove Test Servers remtest = filter a by not Server matches 'Test%';
-- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby G2 = group reqd by (Server,Type,Ops);
--Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;
store G3 into 'Servertest';
----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 11:59 AM Subject: Re: Help with Script
Hi,
Can you paste the error message here?
Sent from my iPhone
On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
hey all, Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double}
I am basically trying to duplicate the following SQL query: select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt)
from TableA
group by 1, 2, 3; My script is as follows: a = LOAD 'Report' AS (
dt:chararray,
Server:chararray,
Type:chararray,
Ops:chararray,
UserID:chararray,
U_cnt:int,
U_tm:int,
U_min_tm:int,
U_max_tm:int,
U_avg_tm:float,
);
--Remove Test Servers
remtest = filter a by not Server matches 'Test%';
-- Filter to required columns
reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby
G2 = group reqd by Server,Type,Ops;
--Sum the User Counts and Times
G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;
--byServeroperation = order G3 by Server;
store G3 into 'Servertest'; ingvay7
+
Prashant Kommireddi 2012-11-13, 18:40
ingvay7 2012-11-13, 18:49
Thanks, Prashant and Pablomar. That fixed it! ----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 1:40 PM Subject: Re: Help with Script
SUM function requires that you specify the specific element from the grouping. In this case, U_tm and U_cnt are both within group/bags and need to be accessed as "reqd.U_tm" and "reqd.U_cnt". --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(reqd.U_tm)as time,SUM(reqd.U_cnt)as count;
On Nov 13, 2012, at 9:58 AM, ingvay7 <[EMAIL PROTECTED]> wrote:
(Apologies for resending but corrected script below) This is the error I got:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast.
Updated code:
a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float ); --Remove Test Servers remtest = filter a by not Server matches 'Test%';
-- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby G2 = group reqd by (Server,Type,Ops);
--Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;
store G3 into 'Servertest';
----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 11:59 AM Subject: Re: Help with Script
Hi,
Can you paste the error message here?
Sent from my iPhone
On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
hey all, Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double}
I am basically trying to duplicate the following SQL query: select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt)
from TableA
group by 1, 2, 3; My script is as follows: a = LOAD 'Report' AS (
dt:chararray,
Server:chararray,
Type:chararray,
Ops:chararray,
UserID:chararray,
U_cnt:int,
U_tm:int,
U_min_tm:int,
U_max_tm:int,
U_avg_tm:float,
);
--Remove Test Servers
remtest = filter a by not Server matches 'Test%';
-- Filter to required columns
reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby
G2 = group reqd by Server,Type,Ops;
--Sum the User Counts and Times
G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;
--byServeroperation = order G3 by Server;
store G3 into 'Servertest'; ingvay7
+
ingvay7 2012-11-13, 18:49
-
Re: Fw: Help with Script
pablomar 2012-11-13, 18:36
what about something like this ?
G3 = foreach G2 generate group,SUM(reqd.U_tm)as time,SUM(reqd.U_cnt)as count;
On Tue, Nov 13, 2012 at 12:57 PM, ingvay7 <[EMAIL PROTECTED]> wrote:
> (Apologies for resending but corrected script below) > > > This is the error I got: > > ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the > matching function for org.apache.pig.builtin.SUM as multiple or none of > them fit. Please use an explicit cast. > > Updated code: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > > --Groupby > G2 = group reqd by (Server,Type,Ops); > > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > > store G3 into 'Servertest'; > > > > > > > > ----- Original Message ----- > From: Prashant Kommireddi <[EMAIL PROTECTED]> > To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> > Cc: > Sent: Tuesday, November 13, 2012 11:59 AM > Subject: Re: Help with Script > > Hi, > > Can you paste the error message here? > > Sent from my iPhone > > On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> > wrote: > > > hey all, > > > > Very new Pig user here. I think I'm trying to get something very simple > done but getting a few errors. See me script below.Any guidance will be > appreciated.Thanks. > > > > I get errors such as Error during parsing. Invalid alias: serverin > {time: double,count: double} > > I am basically trying to duplicate the following SQL query: > > > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > > from TableA > > group by 1, 2, 3; > > > > > > > > My script is as follows: > > > > a = LOAD 'Report' AS ( > > dt:chararray, > > Server:chararray, > > Type:chararray, > > Ops:chararray, > > UserID:chararray, > > U_cnt:int, > > U_tm:int, > > U_min_tm:int, > > U_max_tm:int, > > U_avg_tm:float, > > ); > > > > > > --Remove Test Servers > > remtest = filter a by not Server matches 'Test%'; > > -- Filter to required columns > > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > > --Groupby > > G2 = group reqd by Server,Type,Ops; > > --Sum the User Counts and Times > > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > > --byServeroperation = order G3 by Server; > > store G3 into 'Servertest'; > > > > ingvay7 > >
+
pablomar 2012-11-13, 18:36
ingvay7@...) 2012-11-13, 15:36
hey all,
Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks.
I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} I am basically trying to duplicate the following SQL query:
select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) from TableA group by 1, 2, 3;
My script is as follows:
a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float, ); --Remove Test Servers remtest = filter a by not Server matches 'Test%'; -- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; --Groupby G2 = group reqd by Server,Type,Ops; --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; --byServeroperation = order G3 by Server; store G3 into 'Servertest';
ingvay7
+
ingvay7@...) 2012-11-13, 15:36
Prashant Kommireddi 2012-11-13, 16:59
Hi,
Can you paste the error message here?
Sent from my iPhone
On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> hey all, > > Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. > > I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} > I am basically trying to duplicate the following SQL query: > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > from TableA > group by 1, 2, 3; > > > > My script is as follows: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float, > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > --Groupby > G2 = group reqd by Server,Type,Ops; > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > --byServeroperation = order G3 by Server; > store G3 into 'Servertest'; > > ingvay7
+
Prashant Kommireddi 2012-11-13, 16:59
Vishwanath 2012-11-13, 17:25
This is the error I got:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast.
Updated code:
a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float ); --Remove Test Servers remtest = filter a by not Server matches 'Test%';
-- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6;
--Groupby G2 = group reqd by (Server,Type,Ops);
--Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count;
store G3 into 'Servertest';
----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 11:59 AM Subject: Re: Help with Script
Hi,
Can you paste the error message here?
Sent from my iPhone
On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> hey all, > > Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. > > I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} > I am basically trying to duplicate the following SQL query: > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > from TableA > group by 1, 2, 3; > > > > My script is as follows: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float, > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > --Groupby > G2 = group reqd by Server,Type,Ops; > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > --byServeroperation = order G3 by Server; > store G3 into 'Servertest'; > > ingvay7
+
Vishwanath 2012-11-13, 17:25
pablomar 2012-11-13, 16:59
just taking a quick look, I see a couple of errors: 1_ your LOAD hast one more comma. You need to delete the last one, after U_avg_tm:float
2_ and then, the group by, I think you need parenthesis G2 = group reqd by (Server,Type,Ops);
by the way, where is you alias serverin ?
On Tue, Nov 13, 2012 at 10:36 AM, [EMAIL PROTECTED] <[EMAIL PROTECTED]>wrote:
> hey all, > > Very new Pig user here. I think I'm trying to get something very simple > done but getting a few errors. See me script below.Any guidance will be > appreciated.Thanks. > > I get errors such as Error during parsing. Invalid alias: serverin {time: > double,count: double} > I am basically trying to duplicate the following SQL query: > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > from TableA > group by 1, 2, 3; > > > > My script is as follows: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float, > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > --Groupby > G2 = group reqd by Server,Type,Ops; > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > --byServeroperation = order G3 by Server; > store G3 into 'Servertest'; > > ingvay7
+
pablomar 2012-11-13, 16:59
|
|