Hive >> mail # user >> Difference in number of row observations from distinct and group by

Difference in number of row observations from distinct and group by

I have a table in which three columns together form the primary key. If I run

Select count(distinct col1,col2,col3) from table_name;

and

Select count(a.*) from (select col1,col2,col3,count(*) from table_name group by col1,col2,col3)a ;

the count returned by the first query is 400 less than the count returned by the second.
Can someone please explain why the two queries return different numbers of rows?
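
One possible cause worth ruling out, offered as an assumption rather than a confirmed diagnosis: Hive's count(DISTINCT ...) with several expressions counts only combinations in which every expression is non-NULL, while GROUP BY keeps a group for combinations that contain a NULL. Hive does not enforce primary keys, so NULLs can be present in key columns. A minimal sketch of a check, using the same table and column names as the queries above:

```sql
-- Count the distinct (col1, col2, col3) groups in which at least one
-- key column is NULL. If NULLs are the cause, this number should match
-- the gap between the two counts.
SELECT count(*) AS groups_with_null
FROM (
  SELECT col1, col2, col3
  FROM table_name
  GROUP BY col1, col2, col3
) a
WHERE a.col1 IS NULL
   OR a.col2 IS NULL
   OR a.col3 IS NULL;
```

If this returns 0, NULL handling does not explain the difference and the cause lies elsewhere.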
