Babe Ruth 2013-04-30, 19:03
Jie Li 2013-05-02, 04:38
You can add buckets to the partitions, no problem with that.
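
A minimal sketch of that, assuming a hypothetical table `events` partitioned
by `dt`, a hypothetical source table `staging_events`, and made-up bucketing
columns `user_id` and `session_id` (none of these names come from this
thread). As far as I know the ALTER only changes the table metadata, so
partitions already on disk are not rewritten and only partitions loaded
afterwards pick up the buckets:

```sql
-- Hypothetical table and column names; the bucket count 32 is arbitrary.
ALTER TABLE events CLUSTERED BY (user_id, session_id) INTO 32 BUCKETS;

-- On Hive versions before 2.0, ask Hive to honor the bucket spec on write.
SET hive.enforce.bucketing = true;

-- Load a future partition; with the setting above, rows should be
-- hashed on (user_id, session_id) into 32 bucket files.
INSERT OVERWRITE TABLE events PARTITION (dt = '2013-05-03')
SELECT user_id, session_id, payload
FROM staging_events
WHERE dt = '2013-05-03';
```

Older partitions keep their original, unbucketed layout unless you reload
them the same way.
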
But for a bucketed map join, both tables need to be bucketed on the join
column, and their bucket counts need to be multiples of each other: if
table A has X buckets, then table B needs N*X buckets, where N >= 1.
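
For illustration, here is a sketch with two hypothetical tables `a` and `b`
(names, columns and bucket counts are assumptions, not from this thread).
`a` has 4 buckets and `b` has 8 = 2 * 4, both bucketed on the join key:

```sql
-- Hypothetical tables: X = 4 buckets on a, N*X = 8 buckets on b (N = 2).
CREATE TABLE a (id BIGINT, val STRING)
CLUSTERED BY (id) INTO 4 BUCKETS;

CREATE TABLE b (id BIGINT, val STRING)
CLUSTERED BY (id) INTO 8 BUCKETS;

-- Allow the map join to be done bucket by bucket.
SET hive.optimize.bucketmapjoin = true;

-- The hinted table (a) is the one loaded into memory on the mappers.
SELECT /*+ MAPJOIN(a) */ a.id, a.val, b.val
FROM a JOIN b ON a.id = b.id;
```

Depending on the Hive version and settings, the MAPJOIN hint may be ignored
in favor of automatic join conversion, so check the plan with EXPLAIN.
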
There is no condition on partition keys for the join. Hive only supports
equi-joins, so it is always a good idea to have the tables partitioned on
the same column: that way you don't have to scan the entire table to match
the column values, and you can restrict the data read from each table in
the WHERE clause.
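
A sketch of that kind of restriction, assuming two hypothetical tables
`a_part` and `b_part` that are both partitioned by a `dt` column (all names
here are made up for the example):

```sql
-- Both tables are assumed to be PARTITIONED BY (dt STRING).
-- Filtering on the partition column limits the join to one partition
-- per table instead of scanning every partition.
SELECT a.id, a.val, b.val
FROM a_part a
JOIN b_part b ON a.id = b.id
WHERE a.dt = '2013-05-01'
  AND b.dt = '2013-05-01';
```
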
On Thu, May 2, 2013 at 10:08 AM, Jie Li <[EMAIL PROTECTED]> wrote:
> I tried this interesting idea but also found it a little confusing.
> I guess you'll need to change the table schema so that it has both buckets
> and partitions.
> And to take advantage of the buckets inside the partitions, for example
> using the bucket map join, you'll need to specify one particular partition
> of the table. Seems HIVE-3171 has fixed this problem, but I'm still not
> very clear how two partitioned tables can be joined using bucket map join?
> Do they need the same partition keys and bucket keys, and then Hive will do
> partition-wise join as well as bucket-wise join?
> On Tue, Apr 30, 2013 at 12:03 PM, Babe Ruth <[EMAIL PROTECTED]> wrote:
>> I have a table that is already created and is partitioned dynamically by
>> day. I would like all future partitions to be bucketed on two columns.
>> Can I add buckets to the partitions in an already existing table?