In Hive, if you run a GROUP BY, every non-aggregated column in the SELECT
list has to appear in the GROUP BY clause. This restriction applies only to
plain column references, not to aggregate functions such as count().
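
As a rough sketch against the some_table from your example (the aliases
max_b, c_count and d_values are made up for illustration), either of these
forms should be accepted. Which one you want depends on whether you need one
row per distinct (a, b, c, d) combination or one row per value of a:

```sql
-- Option 1: list every non-aggregated column in the GROUP BY.
-- Returns one row per distinct (a, b, c, d) combination.
SELECT st.a, st.b, st.c, st.d
FROM some_table st
GROUP BY st.a, st.b, st.c, st.d;

-- Option 2: group by st.a only and wrap the other columns in aggregates.
-- Returns one row per distinct value of a.
SELECT st.a,
       max(st.b)         AS max_b,     -- hypothetical alias
       count(st.c)       AS c_count,   -- hypothetical alias
       collect_set(st.d) AS d_values   -- hypothetical alias
FROM some_table st
GROUP BY st.a;
```

The aggregates in option 2 are only placeholders; pick whichever functions
match what you actually want from b, c and d.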
A GROUP BY over several columns is handled in a single MapReduce job; Hive
does not launch a separate MapReduce job for each column in the clause.
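
If you want to check this on your own table, one option is to prefix the
query with EXPLAIN and look at the stage dependencies it prints. This is only
a sketch using the table and columns from your example; the exact plan output
depends on your Hive version and settings:

```sql
-- Prints the execution plan instead of running the query.
-- The STAGE DEPENDENCIES section lists the stages Hive will run,
-- which shows how many MapReduce jobs the GROUP BY needs.
EXPLAIN
SELECT st.a, st.b, st.c, st.d
FROM some_table st
GROUP BY st.a, st.b, st.c, st.d;
```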
I am not sure what you mean by a better way.
On Mon, May 6, 2013 at 11:37 PM, Peter Chu <[EMAIL PROTECTED]> wrote:
> In Hive, I cannot perform a SELECT GROUP BY on fields not in the GROUP BY
> Example: SELECT st.a, st.b, st.c, st.d FROM some_table st GROUP BY st.a;
> -- This does not work.
> To make it work, I would need to add the other fields to the GROUP BY.
> I am not quite sure, but I think each GROUP BY will launch another M/R job.
> Wondering if there is any other way / better way to do group by.