Hive, mail # dev - List Bucketing DDL Patch

Re: List Bucketing DDL Patch
Namit Jain 2012-07-28, 04:09
Yes, that patch would become quite big to be done in a single shot.

Moreover, the skew information can be used by a variety of use-cases.

1. List Bucketing
2. Skew Joins: https://cwiki.apache.org/Hive/skewed-join-optimization.html
3. Another variant of skew joins:

So, the skew information need not be limited to list bucketing only.

So, it might be simpler to split into DDL and DML support.

The DDL will be common to all the use-cases that want to use/store skew information.

Each use-case can implement the DML/Query separately.
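
To illustrate the split described above, here is a minimal sketch, not Hive code. It assumes the DDL records the skew information once and that each use-case reads it independently. The names SkewInfo, list_bucket_dir and join_strategy, the directory naming, and the strategy labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkewInfo:
    """Hypothetical skew metadata, recorded once by the DDL."""
    column: str
    skewed_values: frozenset


def list_bucket_dir(skew: SkewInfo, value) -> str:
    """Use-case 1 (list bucketing): each skewed value gets its own
    directory; all other values share one. Directory names are made up."""
    if value in skew.skewed_values:
        return f"{skew.column}={value}"
    return "default"


def join_strategy(skew: SkewInfo, key) -> str:
    """Use-case 2 (skew join): skewed keys are handled separately from
    the rest. The strategy labels are illustrative only."""
    return "map_join" if key in skew.skewed_values else "common_join"


# The same metadata object serves both use-cases.
skew = SkewInfo(column="c1", skewed_values=frozenset({1, 5}))
bucket_for_skewed = list_bucket_dir(skew, 1)
bucket_for_other = list_bucket_dir(skew, 2)
strategy_for_skewed = join_strategy(skew, 5)
strategy_for_other = join_strategy(skew, 7)
```

The point of the sketch is only that both consumers depend on the stored skew information and not on each other, which is why the DDL can be delivered first.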
On 7/28/12 7:07 AM, "Carl Steinbach" <[EMAIL PROTECTED]> wrote:

>> We are close to releasing the first patch (DDL).
>In a comment on the design doc you said that the first phase would involve
>implementing this feature for a single-column end-to-end (DML+DDL). Has
>that plan changed?
>On Wed, Jul 25, 2012 at 12:31 AM, Gang Tim Liu <[EMAIL PROTECTED]> wrote:
>> Dear all hive developers,
>> Please review the documentation:
>> https://cwiki.apache.org/confluence/display/Hive/ListBucketing
>> We are close to releasing the first patch (DDL).
>> We will continue to update the wiki with new information and, in the
>> meantime, would like to collect your feedback.
>> Thanks
>> Tim