bucketing on a column with millions of unique IDs
Echo Li 2013-02-21, 00:19
I plan to bucket a table by "userid" as I'm going to do intensive calculations
using "group by userid". There are about 110 million rows, with 7 million
unique userids, so my question is: what is a good number of buckets for this
scenario, and how do I determine the number of buckets?
Any input is appreciated :)