|
|
-
How to get percentage of each group?
MiaoMiao 2012-09-07, 05:23
I have a table, containing userId and userType. userId userType 1 A 2 B 3 C 4 A 5 B 6 B
I want to get percentage of each userType. My current solution: 1. Get count of each group via THRIFT select userType , count(1) from some_table group by userType
2. Calculate each userType using other programming language like PHP.
This solution is fine, but I'm just curious, is there a way to do it in one query? I know this query works in mysql, but not hive. select userType , count(1)/(select count(1) from some_table) from some_table group by userType
-
Re: How to get percentage of each group?
Bertrand Dechoux 2012-09-07, 05:34
Hi,
You could use a cross join. You basically have one table
select userType , count(1) from some_table group by userType
and a second one
select count(1) from some_table
With a cross join you can add the global count to every results in the first table and compute a ratio.
Regards
Bertrand On Fri, Sep 7, 2012 at 7:23 AM, MiaoMiao <[EMAIL PROTECTED]> wrote:
> I have a table, containing userId and userType. > userId userType > 1 A > 2 B > 3 C > 4 A > 5 B > 6 B > > I want to get percentage of each userType. > My current solution: > 1. Get count of each group via THRIFT > select > userType > , count(1) > from > some_table > group by > userType > > 2. Calculate each userType using other programming language like PHP. > > This solution is fine, but I'm just curious, is there a way to do it > in one query? > I know this query works in mysql, but not hive. > select > userType > , count(1)/(select count(1) from some_table) > from > some_table > group by > userType >
-- Bertrand Dechoux
-
Re: How to get percentage of each group?
MiaoMiao 2012-09-07, 06:16
You mean this one? Hive 0.8 doesn't have CROSS JOIN, so I don't know if this query works. SELECT A.userType , A.userType_count/B.global_count FROM ( SELECT userType , COUNT(1) as userType_count FROM some_table GROUP BY userType ) A CROSS JOIN ( SELECT COUNT(1) as global_count FROM some_table ) B; On Fri, Sep 7, 2012 at 1:34 PM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote: > Hi, > > You could use a cross join. > You basically have one table > > select > userType > , count(1) > from > some_table > group by > userType > > and a second one > > select count(1) from some_table > > With a cross join you can add the global count to every results in the first > table and compute a ratio. > > Regards > > Bertrand > > > On Fri, Sep 7, 2012 at 7:23 AM, MiaoMiao <[EMAIL PROTECTED]> wrote: >> >> I have a table, containing userId and userType. >> userId userType >> 1 A >> 2 B >> 3 C >> 4 A >> 5 B >> 6 B >> >> I want to get percentage of each userType. >> My current solution: >> 1. Get count of each group via THRIFT >> select >> userType >> , count(1) >> from >> some_table >> group by >> userType >> >> 2. Calculate each userType using other programming language like PHP. >> >> This solution is fine, but I'm just curious, is there a way to do it >> in one query? >> I know this query works in mysql, but not hive. >> select >> userType >> , count(1)/(select count(1) from some_table) >> from >> some_table >> group by >> userType > > > > > -- > Bertrand Dechoux
-
Re: How to get percentage of each group?
Bejoy KS 2012-09-07, 06:59
Hi
CROSS JOIN is same as giving JOIN keyword. CROSS JOIN just a new notation in later releases of hive. JOIN without ON is same as CROSS JOIN Regards, Bejoy KS ________________________________ From: MiaoMiao <[EMAIL PROTECTED]> To: [EMAIL PROTECTED] Sent: Friday, September 7, 2012 11:46 AM Subject: Re: How to get percentage of each group? You mean this one? Hive 0.8 doesn't have CROSS JOIN, so I don't know if this query works. SELECT A.userType , A.userType_count/B.global_count FROM ( SELECT userType , COUNT(1) as userType_count FROM some_table GROUP BY userType ) A CROSS JOIN ( SELECT COUNT(1) as global_count FROM some_table ) B; On Fri, Sep 7, 2012 at 1:34 PM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote: > Hi, > > You could use a cross join. > You basically have one table > > select > userType > , count(1) > from > some_table > group by > userType > > and a second one > > select count(1) from some_table > > With a cross join you can add the global count to every results in the first > table and compute a ratio. > > Regards > > Bertrand > > > On Fri, Sep 7, 2012 at 7:23 AM, MiaoMiao <[EMAIL PROTECTED]> wrote: >> >> I have a table, containing userId and userType. >> userId userType >> 1 A >> 2 B >> 3 C >> 4 A >> 5 B >> 6 B >> >> I want to get percentage of each userType. >> My current solution: >> 1. Get count of each group via THRIFT >> select >> userType >> , count(1) >> from >> some_table >> group by >> userType >> >> 2. Calculate each userType using other programming language like PHP. >> >> This solution is fine, but I'm just curious, is there a way to do it >> in one query? >> I know this query works in mysql, but not hive. >> select >> userType >> , count(1)/(select count(1) from some_table) >> from >> some_table >> group by >> userType > > > > > -- > Bertrand Dechoux
-
Re: How to get percentage of each group?
MiaoMiao 2012-09-07, 07:10
Yeah I figure it out, this query will create 3 jobs, it seems to do sub-queries without any optimization.
SELECT A.userType , A.userType_count/B.global_count FROM ( SELECT userType , COUNT(1) as userType_count FROM some_table GROUP BY userType ) A JOIN ( SELECT COUNT(1) as global_count FROM some_table ) B ON (1=1);
But if all userType are enumerable, I can get their ratio in one job, which seems like a more optimized way.
SELECT SUM(IF(userType='A',1,0))/count(1) as A_ratio , SUM(IF(userType='B',1,0))/count(1) as B_ratio , SUM(IF(userType='C',1,0))/count(1) as C_ratio FROM some_table;
On Fri, Sep 7, 2012 at 2:59 PM, Bejoy KS <[EMAIL PROTECTED]> wrote: > Hi > > CROSS JOIN is same as giving JOIN keyword. CROSS JOIN just a new notation in > later releases of hive. JOIN without ON is same as CROSS JOIN > > Regards, > Bejoy KS > > ________________________________ > From: MiaoMiao <[EMAIL PROTECTED]> > To: [EMAIL PROTECTED] > Sent: Friday, September 7, 2012 11:46 AM > Subject: Re: How to get percentage of each group? > > You mean this one? Hive 0.8 doesn't have CROSS JOIN, so I don't know > if this query works. > SELECT > A.userType > , A.userType_count/B.global_count > FROM > ( > SELECT > userType > , COUNT(1) as userType_count > FROM > some_table > GROUP BY > userType > ) A > CROSS JOIN > ( > SELECT > COUNT(1) as global_count > FROM > some_table > ) B; > On Fri, Sep 7, 2012 at 1:34 PM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote: >> Hi, >> >> You could use a cross join. >> You basically have one table >> >> select >> userType >> , count(1) >> from >> some_table >> group by >> userType >> >> and a second one >> >> select count(1) from some_table >> >> With a cross join you can add the global count to every results in the >> first >> table and compute a ratio. >> >> Regards >> >> Bertrand >> >> >> On Fri, Sep 7, 2012 at 7:23 AM, MiaoMiao <[EMAIL PROTECTED]> wrote: >>> >>> I have a table, containing userId and userType. >>> userId userType >>> 1 A >>> 2 B >>> 3 C >>> 4 A >>> 5 B >>> 6 B >>> >>> I want to get percentage of each userType. >>> My current solution: >>> 1. Get count of each group via THRIFT >>> select >>> userType >>> , count(1) >>> from >>> some_table >>> group by >>> userType >>> >>> 2. Calculate each userType using other programming language like PHP. >>> >>> This solution is fine, but I'm just curious, is there a way to do it >>> in one query? >>> I know this query works in mysql, but not hive. >>> select >>> userType >>> , count(1)/(select count(1) from some_table) >>> from >>> some_table >>> group by >>> userType >> >> >> >> >> -- >> Bertrand Dechoux > >
|
|