Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # dev >> Review Request 16440: HIVE-6098: Merge Tez branch into trunk


Copy link to this message
-
Review Request 16440: HIVE-6098: Merge Tez branch into trunk

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16440/
-----------------------------------------------------------

Review request for hive.
Repository: hive
Description
-------

I think the Tez branch is at a point where we can consider merging it back into trunk after review.
Tez itself has had its first release, most hive features are available on Tez and the test coverage is decent. There are a few known limitations, all of which can be handled in trunk as far as I can tell (i.e.: None of them are large disruptive changes that still require a branch.)
Limitations:
Union all is not yet supported on Tez
SMB is not yet supported on Tez
Bucketed map-join is executed as broadcast join (bucketing is ignored)
Since the user is free to toggle hive.optimize.tez, it's obviously possible to just run these on MR.
I am hoping to follow the approach that was taken with vectorization and shoot for a merge instead of single commit. This would retain history of the branch. Also in vectorization we required at least three +1s before merge, I'm hoping to go with that as well.
I will add a combined patch to this ticket for review purposes (not for commit). I'll also attach instructions to run on a cluster if anyone wants to try.
Diffs
-----
Diff: https://reviews.apache.org/r/16440/diff/
Testing
-------

Testing has been done as a combination of unit tests and q file tests. Unit tests have been created for new classes where possible, new q file tests cover the delta. I've also gone through all .q file tests and picked the relevant ones to run on MiniTezCliDriver.
Thanks,

Gunther Hagleitner