Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive, mail # user - Difference in number of row observstions from distinct and group by


Copy link to this message
-
RE: Difference in number of row observstions from distinct and group by
Mayank Bansal 2013-11-25, 08:06
Hi,

I was also thinking that this might be the case. For that reason I ran this query

Select * from (select col1,col2,col3,count(*)  as val from table_name
group by col1,col2,col3)a where a.val>1 ;

The output that I receive from this query is blank, then I ended up doing count(*) and I got the same number of rows as originally in the table. Please help me figure this out.

Thanks,
Mayank
-----Original Message-----
From: Thejas Nair [mailto:[EMAIL PROTECTED]]
Sent: Friday, November 22, 2013 1:49 AM
To: <[EMAIL PROTECTED]>
Subject: Re: Difference in number of row observstions from distinct and group by

You probably have 400 rows where col1, col2 and col3 have null values.
"count(distinct col1,col2,col3) " will not count those rows.
On Thu, Nov 21, 2013 at 7:13 AM, Mayank Bansal <[EMAIL PROTECTED]<mailto:[EMAIL PROTECTED]>> wrote:
> Hi,
>
>
>
> I have a table which has 3 columns combined together to form a primary key.
> If I do
>
>
>
> Select count(distinct col1,col2,col3) from table_name;
>
>
>
> And
>
>
>
> Select count(a.*) from (select col1,col2,col3,count(*) from table_name
> group by col1,col2,col3)a ;
>
>
>
> While running the first query, the count of rows that I get is 400
> less than what I get by the second query.
>
> Can someone please explain to me the difference in number of
> observations from both the queries?
>
>
>
> Thanks
>
> Mayank
>
> This email message may contain proprietary, private and confidential
> information. The information transmitted is intended only for the
> person(s) or entities to which it is addressed. Any review,
> retransmission, dissemination or other use of, or taking of any action
> in reliance upon, this information by persons or entities other than
> the intended recipient is prohibited and may be illegal. If you
> received this in error, please contact the sender and delete the
> message from your system. Mu Sigma takes all reasonable steps to
> ensure that its electronic communications are free from viruses.
> However, given Internet accessibility, the Company cannot accept
> liability for any virus introduced by this e-mail or any attachment and you are advised to use up-to-date virus checking software.

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.

Disclaimer: http://www.mu-sigma.com/disclaimer.html