Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Drill, mail # dev - Broadcast exchange


Copy link to this message
-
Broadcast exchange
Timothy Chen 2013-11-15, 18:04


Sent from my iPhone

Begin forwarded message:

> From: Timothy Chen <[EMAIL PROTECTED]>
> Date: November 15, 2013 at 1:32:26 AM HST
> To: Jacques Nadeau <[EMAIL PROTECTED]>
> Subject: Broadcast exchange
>
> Hey Jacques,
>
> I'm working on the broadcast sender, and I remember we had some chat about the scenario and you mentioned buffering.
>
> Right now I have the most simplest version which I have single broadcast exchange which a single broadcast sender sends every incoming batch to all the random receivers.
>
> Couple questions I have now:
>
> 1) the scenario where I thought broadcast is very useful is when we join two data sources and one of them is a small fact table, so we copy that table to all join nodes for fast joining. However, I don't remember seeing any test if we have multiple scans joining?
>
> The physical plan for that will be two scan nodes, one uses broadcast exchange another perhaps hash to random exchange, and join from the two exchanges that sends to screen.
>
> Does that plan work? I feel like I might be missing something.
>
> 2) my impl doesn't do any buffering and just creates a fragment writable batch for each destination from the incoming batch buffers and sends it in a for loop. I was wondering what is the buffering you thought necessary when you mentioned it earlier?
>
> Btw I'll be soon experience the hangout time pain as the china folks as I am going to be in Taiwan for 2 weeks from this Sunday :)
>
> Tim
>
> Sent from my iPhone