Rajesh Srinivasan 2012-11-15, 03:31
Hi,
Can you help me with the syntax of the natural logarithm (base e) of an expression in Pig? According to Help, the syntax is LOG(expression).
I am trying to basically perform the following query:
select server, processor, area, log(server_time)/log(2) as LogGroup,
count(*) as users, sum(server_time) as group_time, sum(server_cnt) as group_cnt
from Table_reqd
group by 1, 2, 3, 4
My script is like:

--Load
L = LOAD '/user/RS/serverdata1030' AS (
server:chararray,
processor:chararray,
area:chararray,
server_time:int,
server_cnt:int,
);

--Group After loading data
A = group a by (server,processor,area,(double)LOG((server_time)+1) as LogGroup);

-- Generate Counts and Sums
B= foreach A generate group,(long) COUNT(reqd)as Users,(long) SUM(reqd.server_time)as time,(long) SUM(reqd.server_cnt)as count;

Store B into 'data';

The job fails and I get 'ERROR org.apache.pig.tools.grunt.GruntParser - ERROR 2244: Job failed, hadoop does not return any error message.'
I tried changing the datatypes etc but to no avail. Any thoughts on the correct syntax? Thanks.
Re: Logarithm functions
Prashant Kommireddi 2013-03-07, 08:04
You can do a "log n base 2" operation, as in your SQL query, by computing it in a FOREACH and then grouping on the result:
A = foreach L generate *, LOG(server_time)/LOG(2) as LogGroup;
B = group A by (server,processor,area, LogGroup);

On Wed, Nov 14, 2012 at 7:31 PM, Rajesh Srinivasan <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Can you hep me with the syntax of the natural logarithm (base e) of an
> expression in Pig? According to Help, the syntax is LOG(expression).
>
> I am trying to basically perform the following query:
>
> select server, processor, area, log(server_time)/log(2) as LogGroup,
> count(*) as users, sum(server_time) as group_time, sum(server_cnt) as
> group_cnt
> from Table_reqd
> group by 1, 2, 3, 4
>
> My script is like:
> --Load
> L = LOAD '/user/RS/serverdata1030' AS (
> server:chararray,
> processor:chararray,
> area:chararray,
> server_time:int,
> server_cnt:int,
>
> );
>
>
> --Group After loading data
>
> A = group a by (server,processor,area,(double)LOG((server_time)+1) as
> LogGroup);
>
> -- Generate Counts and Sums
> B= foreach A generate group,(long) COUNT(reqd)as Users,(long)
> SUM(reqd.server_time)as time,(long) SUM(reqd.server_cnt)as count;
>
> Store B into 'data';
>
>
> The job fails and I get 'ERROR org.apache.pig.tools.grunt.GruntParser -
> ERROR 2244: Job failed, hadoop does not return any error message.'
>
> I tried changing the datatypes etc but to no avail. Any thoughts on the
> correct syntax? Thanks.
>
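For reference, here is an untested sketch of how the whole script might look with the log computed in a FOREACH before the GROUP. It assumes the input path and schema from the original post (with the trailing comma after server_cnt removed) and the default tab-delimited loader. The relation name C, the output aliases users, group_time and group_cnt (taken from the SQL query), and the explicit casts to double are my own choices, not something confirmed in this thread.

```pig
-- Load; schema as in the original post, without the trailing comma
L = LOAD '/user/RS/serverdata1030' AS (
    server:chararray,
    processor:chararray,
    area:chararray,
    server_time:int,
    server_cnt:int
);

-- Compute log base 2 as a regular field before grouping.
-- LOG is the natural log, so divide by LOG(2.0) to change base.
-- Assumption: server_time > 0; the original script used server_time + 1,
-- presumably to avoid taking the log of zero.
A = FOREACH L GENERATE server, processor, area, server_time, server_cnt,
        LOG((double)server_time) / LOG(2.0) AS LogGroup;

-- Group on the three keys plus the computed field
B = GROUP A BY (server, processor, area, LogGroup);

-- Aggregate over the bag of the grouped relation (A), not an undefined alias
C = FOREACH B GENERATE
        FLATTEN(group) AS (server, processor, area, LogGroup),
        COUNT(A) AS users,
        SUM(A.server_time) AS group_time,
        SUM(A.server_cnt) AS group_cnt;

STORE C INTO 'data';
```

Note that grouping on the raw double gives one group per distinct server_time value. If the intent is power-of-two buckets, wrapping the expression in FLOOR(...) before grouping may be closer to what is wanted, but that is a guess about the goal.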