|
ingvay7@...)
2012-11-13, 15:36
Prashant Kommireddi
2012-11-13, 16:59
pablomar
2012-11-13, 16:59
Vishwanath
2012-11-13, 17:25
ingvay7
2012-11-13, 17:57
pablomar
2012-11-13, 18:36
Prashant Kommireddi
2012-11-13, 18:40
ingvay7
2012-11-13, 18:49
|
-
Help with Scriptingvay7@...) 2012-11-13, 15:36
hey all,
Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} I am basically trying to duplicate the following SQL query: select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) from TableA group by 1, 2, 3; My script is as follows: a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float, ); --Remove Test Servers remtest = filter a by not Server matches 'Test%'; -- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; --Groupby G2 = group reqd by Server,Type,Ops; --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; --byServeroperation = order G3 by Server; store G3 into 'Servertest'; ingvay7
-
Re: Help with ScriptPrashant Kommireddi 2012-11-13, 16:59
Hi,
Can you paste the error message here? Sent from my iPhone On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: > hey all, > > Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. > > I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} > I am basically trying to duplicate the following SQL query: > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > from TableA > group by 1, 2, 3; > > > > My script is as follows: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float, > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > --Groupby > G2 = group reqd by Server,Type,Ops; > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > --byServeroperation = order G3 by Server; > store G3 into 'Servertest'; > > ingvay7
-
Re: Help with Scriptpablomar 2012-11-13, 16:59
just taking a quick look, I see a couple of errors:
1_ your LOAD hast one more comma. You need to delete the last one, after U_avg_tm:float 2_ and then, the group by, I think you need parenthesis G2 = group reqd by (Server,Type,Ops); by the way, where is you alias serverin ? On Tue, Nov 13, 2012 at 10:36 AM, [EMAIL PROTECTED] <[EMAIL PROTECTED]>wrote: > hey all, > > Very new Pig user here. I think I'm trying to get something very simple > done but getting a few errors. See me script below.Any guidance will be > appreciated.Thanks. > > I get errors such as Error during parsing. Invalid alias: serverin {time: > double,count: double} > I am basically trying to duplicate the following SQL query: > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > from TableA > group by 1, 2, 3; > > > > My script is as follows: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float, > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > --Groupby > G2 = group reqd by Server,Type,Ops; > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > --byServeroperation = order G3 by Server; > store G3 into 'Servertest'; > > ingvay7
-
Re: Help with ScriptVishwanath 2012-11-13, 17:25
This is the error I got:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast. Updated code: a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float ); --Remove Test Servers remtest = filter a by not Server matches 'Test%'; -- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; --Groupby G2 = group reqd by (Server,Type,Ops); --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; store G3 into 'Servertest'; ----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 11:59 AM Subject: Re: Help with Script Hi, Can you paste the error message here? Sent from my iPhone On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: > hey all, > > Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. > > I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} > I am basically trying to duplicate the following SQL query: > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > from TableA > group by 1, 2, 3; > > > > My script is as follows: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float, > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > --Groupby > G2 = group reqd by Server,Type,Ops; > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > --byServeroperation = order G3 by Server; > store G3 into 'Servertest'; > > ingvay7
-
Fw: Help with Scriptingvay7 2012-11-13, 17:57
(Apologies for resending but corrected script below)
This is the error I got: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast. Updated code: a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float ); --Remove Test Servers remtest = filter a by not Server matches 'Test%'; -- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; --Groupby G2 = group reqd by (Server,Type,Ops); --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; store G3 into 'Servertest'; ----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 11:59 AM Subject: Re: Help with Script Hi, Can you paste the error message here? Sent from my iPhone On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: > hey all, > > Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. > > I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} > I am basically trying to duplicate the following SQL query: > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > from TableA > group by 1, 2, 3; > > > > My script is as follows: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float, > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > --Groupby > G2 = group reqd by Server,Type,Ops; > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > --byServeroperation = order G3 by Server; > store G3 into 'Servertest'; > > ingvay7
-
Re: Fw: Help with Scriptpablomar 2012-11-13, 18:36
what about something like this ?
G3 = foreach G2 generate group,SUM(reqd.U_tm)as time,SUM(reqd.U_cnt)as count; On Tue, Nov 13, 2012 at 12:57 PM, ingvay7 <[EMAIL PROTECTED]> wrote: > (Apologies for resending but corrected script below) > > > This is the error I got: > > ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the > matching function for org.apache.pig.builtin.SUM as multiple or none of > them fit. Please use an explicit cast. > > Updated code: > > a = LOAD 'Report' AS ( > dt:chararray, > Server:chararray, > Type:chararray, > Ops:chararray, > UserID:chararray, > U_cnt:int, > U_tm:int, > U_min_tm:int, > U_max_tm:int, > U_avg_tm:float > ); > > > --Remove Test Servers > remtest = filter a by not Server matches 'Test%'; > > -- Filter to required columns > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > > --Groupby > G2 = group reqd by (Server,Type,Ops); > > --Sum the User Counts and Times > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > > store G3 into 'Servertest'; > > > > > > > > ----- Original Message ----- > From: Prashant Kommireddi <[EMAIL PROTECTED]> > To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> > Cc: > Sent: Tuesday, November 13, 2012 11:59 AM > Subject: Re: Help with Script > > Hi, > > Can you paste the error message here? > > Sent from my iPhone > > On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> > wrote: > > > hey all, > > > > Very new Pig user here. I think I'm trying to get something very simple > done but getting a few errors. See me script below.Any guidance will be > appreciated.Thanks. > > > > I get errors such as Error during parsing. Invalid alias: serverin > {time: double,count: double} > > I am basically trying to duplicate the following SQL query: > > > > select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) > > from TableA > > group by 1, 2, 3; > > > > > > > > My script is as follows: > > > > a = LOAD 'Report' AS ( > > dt:chararray, > > Server:chararray, > > Type:chararray, > > Ops:chararray, > > UserID:chararray, > > U_cnt:int, > > U_tm:int, > > U_min_tm:int, > > U_max_tm:int, > > U_avg_tm:float, > > ); > > > > > > --Remove Test Servers > > remtest = filter a by not Server matches 'Test%'; > > -- Filter to required columns > > reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; > > --Groupby > > G2 = group reqd by Server,Type,Ops; > > --Sum the User Counts and Times > > G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; > > --byServeroperation = order G3 by Server; > > store G3 into 'Servertest'; > > > > ingvay7 > >
-
Re: Help with ScriptPrashant Kommireddi 2012-11-13, 18:40
SUM function requires that you specify the specific element from the
grouping. In this case, U_tm and U_cnt are both within group/bags and need to be accessed as "reqd.U_tm" and "reqd.U_cnt". --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(reqd.U_tm)as time,SUM(reqd.U_cnt)as count; On Nov 13, 2012, at 9:58 AM, ingvay7 <[EMAIL PROTECTED]> wrote: (Apologies for resending but corrected script below) This is the error I got: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast. Updated code: a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float ); --Remove Test Servers remtest = filter a by not Server matches 'Test%'; -- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; --Groupby G2 = group reqd by (Server,Type,Ops); --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; store G3 into 'Servertest'; ----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 11:59 AM Subject: Re: Help with Script Hi, Can you paste the error message here? Sent from my iPhone On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: hey all, Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} I am basically trying to duplicate the following SQL query: select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) from TableA group by 1, 2, 3; My script is as follows: a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float, ); --Remove Test Servers remtest = filter a by not Server matches 'Test%'; -- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; --Groupby G2 = group reqd by Server,Type,Ops; --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; --byServeroperation = order G3 by Server; store G3 into 'Servertest'; ingvay7
-
Re: Help with Scriptingvay7 2012-11-13, 18:49
Thanks, Prashant and Pablomar. That fixed it!
----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 1:40 PM Subject: Re: Help with Script SUM function requires that you specify the specific element from the grouping. In this case, U_tm and U_cnt are both within group/bags and need to be accessed as "reqd.U_tm" and "reqd.U_cnt". --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(reqd.U_tm)as time,SUM(reqd.U_cnt)as count; On Nov 13, 2012, at 9:58 AM, ingvay7 <[EMAIL PROTECTED]> wrote: (Apologies for resending but corrected script below) This is the error I got: ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1045: Could not infer the matching function for org.apache.pig.builtin.SUM as multiple or none of them fit. Please use an explicit cast. Updated code: a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float ); --Remove Test Servers remtest = filter a by not Server matches 'Test%'; -- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; --Groupby G2 = group reqd by (Server,Type,Ops); --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; store G3 into 'Servertest'; ----- Original Message ----- From: Prashant Kommireddi <[EMAIL PROTECTED]> To: "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> Cc: Sent: Tuesday, November 13, 2012 11:59 AM Subject: Re: Help with Script Hi, Can you paste the error message here? Sent from my iPhone On Nov 13, 2012, at 8:34 AM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote: hey all, Very new Pig user here. I think I'm trying to get something very simple done but getting a few errors. See me script below.Any guidance will be appreciated.Thanks. I get errors such as Error during parsing. Invalid alias: serverin {time: double,count: double} I am basically trying to duplicate the following SQL query: select Server, Type, Ops, count(*) users, sum(U_tm) , sum(U_cnt) from TableA group by 1, 2, 3; My script is as follows: a = LOAD 'Report' AS ( dt:chararray, Server:chararray, Type:chararray, Ops:chararray, UserID:chararray, U_cnt:int, U_tm:int, U_min_tm:int, U_max_tm:int, U_avg_tm:float, ); --Remove Test Servers remtest = filter a by not Server matches 'Test%'; -- Filter to required columns reqd = foreach remtest generate $1,$2,$3,$4,$5,$6; --Groupby G2 = group reqd by Server,Type,Ops; --Sum the User Counts and Times G3 = foreach G2 generate group,SUM(U_tm)as time,SUM(U_cnt)as count; --byServeroperation = order G3 by Server; store G3 into 'Servertest'; ingvay7 |