Hive, mail # dev - Tez branch and tez based patches

Re: Tez branch and tez based patches
Alan Gates 2013-08-05, 17:54

On Jul 29, 2013, at 9:53 PM, Edward Capriolo wrote:

> Also watched http://www.ustream.tv/recorded/36323173
> I definitely see the win in being able to stream inter-stage output.
> I see some cases where small intermediate results can be kept "In memory".
> But I was somewhat under the impression that the map reduce spill settings
> kept stuff in memory; isn't that what spill settings are for?

No.  MapReduce always writes shuffle data to local disk, and intermediate results between MR jobs are always persisted to HDFS, as there's no other option.  When we talk of keeping intermediate results in memory we mean getting rid of both of these disk writes/reads when appropriate (meaning not always; there's a trade-off between speed and error handling to be made here, see below for more details).
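To make the two write points concrete, here is a toy Python sketch of a two-job MR pipeline.  This is plain Python, not Hadoop API code; all names and the disk-log strings are invented for illustration:

```python
# Toy model of where classic MapReduce touches disk in a two-job
# pipeline. Plain Python, not Hadoop API code; names are illustrative.

def map_reduce(records, mapper, reducer, disk_log):
    pairs = [kv for rec in records for kv in mapper(rec)]
    # Shuffle: real MapReduce always spills this to local disk,
    # regardless of the in-memory spill-buffer settings.
    disk_log.append("local-disk: shuffle data")
    groups = {}
    for k, v in pairs:
        groups.setdefault(k, []).append(v)
    return [out for k in sorted(groups) for out in reducer(k, groups[k])]

def pipeline(lines):
    disk_log = []
    # Job 1: word count.
    counts = map_reduce(lines,
                        lambda line: [(w, 1) for w in line.split()],
                        lambda k, vs: [(k, sum(vs))],
                        disk_log)
    # Between MR jobs the intermediate result always goes to HDFS;
    # this is the second disk round trip Tez can skip when it is safe.
    disk_log.append("HDFS: intermediate job output")
    # Job 2: pick the most frequent word.
    top = map_reduce(counts,
                     lambda kv: [("top", kv)],
                     lambda k, vs: [max(vs, key=lambda wc: wc[1])],
                     disk_log)
    return top, disk_log
```

Running the pipeline shows one local-disk write per shuffle plus one HDFS write between the jobs; those are exactly the I/O points under discussion.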

> There are a few bullet points that came up repeatedly that I do not follow:
> Something was said to the effect of "Container reuse makes X faster".
> Hadoop has JVM reuse. I am not following what the difference is here. Not
> everyone has a 10K node cluster.

Sharing JVMs across users is inherently insecure (we can't guarantee that code the first user left behind won't interfere with later users).  As I understand it, container re-use in Tez constrains the re-use to one user for security reasons, but still avoids additional JVM start-up costs.  But this is a question that the Tez guys could answer better on the Tez lists ([EMAIL PROTECTED]).

> "Joins in map reduce are hard" Really? I mean some of them are, I guess, but
> the typical join is very easy: just shuffle by the join key. There was not
> really enough low-level detail here saying why joins are better in Tez.

Join is not a natural operation in MapReduce.  MR gives you one input and one output.  You end up having to bend the rules to have multiple inputs.  The idea here is that Tez can provide operators that naturally work with joins and other operations that don't fit the one input/one output model (e.g., unions).
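To illustrate the rule-bending, here is a toy reduce-side join in plain Python (not MapReduce API code; the relations and records are invented).  The two logical inputs have to be tagged and funneled through MapReduce's single map input, then separated again in the reducer:

```python
# Toy reduce-side join showing the workaround for MR's one-input model:
# tag each record with its source relation, shuffle by the join key,
# and untangle the tags in the reducer.

def tagged_map(users, orders):
    # Tags ("U"/"O") let the reducer tell the two inputs apart.
    for user_id, name in users:
        yield user_id, ("U", name)
    for user_id, item in orders:
        yield user_id, ("O", item)

def join_reduce(key, tagged_values):
    names = [v for tag, v in tagged_values if tag == "U"]
    items = [v for tag, v in tagged_values if tag == "O"]
    for name in names:
        for item in items:
            yield key, name, item

def shuffle_join(users, orders):
    groups = {}
    for k, v in tagged_map(users, orders):  # "shuffle by the join key"
        groups.setdefault(k, []).append(v)
    return [row for k in sorted(groups) for row in join_reduce(k, groups[k])]
```

The join works, but the tagging is exactly the kind of bookkeeping that a runtime with native multi-input operators makes unnecessary.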

> "Choosing the number of maps and reduces is hard" Really? I do not find it
> that hard; I think there are times when it's not perfect, but I do not find
> it hard. The talk did not really offer anything technical here on how Tez
> makes this better, other than that it could make it better.

Perhaps "manual" would be a better term here than "hard".  In our experience it takes quite a bit of engineer trial and error to determine the optimal numbers.  This may be OK if you're going to invest the time once and then run the same query every day for 6 months.  But obviously it doesn't work for the ad hoc case.  Even in the batch case it's not optimal, because every once in a while an engineer has to go back and re-optimize the query to deal with changing data sizes, data characteristics, etc.  We want the optimizer to handle this without human intervention.
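Hive already applies a rule of roughly this shape when the user does not set a reducer count (cf. hive.exec.reducers.bytes.per.reducer).  The sketch below uses invented constants and an invented function name, not Hive's actual defaults or code:

```python
import math

# Illustrative reducer-count heuristic: one reducer per chunk of input
# bytes, clamped to a sane range. Constants are made up for the example.

def estimate_reducers(input_bytes,
                      bytes_per_reducer=256 * 1024 * 1024,
                      max_reducers=1009):
    return max(1, min(max_reducers,
                      math.ceil(input_bytes / bytes_per_reducer)))
```

The point is not the particular constants but that the system can recompute this on every run as data sizes change, which a hand-tuned setting cannot.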

> The presentations mentioned streaming data. How do two nodes stream data
> between tasks, and how is it reliable? If the sender or receiver dies, does
> the entire process have to start again?

If the sender or receiver dies then the query has to be restarted from some previous point where data was persisted to disk.  The idea here is that speed vs. error-recovery trade-offs should be made by the optimizer.  If the optimizer estimates that a query will complete in 5 seconds, it can stream everything, and if a node fails it just re-runs the whole query.  If it estimates that a particular phase of a query will run for an hour, it can choose to persist the results to HDFS so that in the event of a failure downstream the long phase need not be re-run.  Again, we want this to be done automatically by the system so the user doesn't need to control this level of detail.
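The decision rule being described can be sketched in a few lines.  This is a hypothetical illustration of the trade-off, not anything Tez actually implements; the function name and the 60-second threshold are invented:

```python
# Hypothetical optimizer rule: cheap stages stream their output and are
# simply re-run on failure; expensive stages checkpoint to HDFS so a
# downstream failure can resume from the checkpoint instead.

def choose_edge_type(estimated_stage_seconds, threshold_seconds=60):
    if estimated_stage_seconds < threshold_seconds:
        return "stream"          # fast to re-run, so skip the disk write
    return "persist-to-hdfs"     # too costly to re-run, so checkpoint
```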

> Again, one of the talks implied there is a prototype out there that launches
> Hive jobs into Tez. I would like to see that; it might answer more
> questions than a PowerPoint, and I could profile some common queries.

As mentioned in a previous email, afaik Gunther has pushed all these changes to the Tez branch in Hive.