I think the generic hash-join strategy is this: for some small set A, we
send the whole set to each partition of a larger set B and do the join in
parallel. In this case, whichever set is smaller would be consumed on one
worker and then distributed to every worker participating in the hash
join. Outside of Presto, this is often done with an iterator that takes
the smaller set as an argument.
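
To make that concrete, here is a rough sketch in Python of the strategy as I understand it. It is not how Presto implements it: the function names are made up for illustration, rows are plain dicts, and the "workers" are simulated by looping over the partitions in one process.

```python
from collections import defaultdict


def build_hash_table(small_rows, key):
    """Build phase: index the smaller set by its join key (done once)."""
    table = defaultdict(list)
    for row in small_rows:
        table[row[key]].append(row)
    return dict(table)


def probe_partition(partition_rows, table, key):
    """Probe phase: runs independently on each partition of the larger set.

    The hash table of the smaller set is passed in as an argument, much like
    handing the smaller set to an iterator.
    """
    for row in partition_rows:
        for match in table.get(row[key], []):
            # Merge the matching rows; on a column name clash the row from
            # the larger set wins.
            yield {**match, **row}


def broadcast_hash_join(small_rows, large_partitions, key):
    """Send the smaller set to every partition and join each one.

    The loop below stands in for the parallel workers (an assumption made to
    keep the sketch self-contained).
    """
    table = build_hash_table(small_rows, key)
    results = []
    for partition in large_partitions:
        results.extend(probe_partition(partition, table, key))
    return results


if __name__ == "__main__":
    small = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    partitions = [
        [{"id": 1, "v": 10}, {"id": 3, "v": 30}],
        [{"id": 2, "v": 20}, {"id": 1, "v": 11}],
    ]
    for joined in broadcast_hash_join(small, partitions, "id"):
        print(joined)
```

The point is that only the smaller set is copied around. Each partition of the larger set stays where it is and is scanned once.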
On Mon, Jun 13, 2016 at 4:03 PM, Dylan Hutchison <[EMAIL PROTECTED]