
Re: Proposal to create a branch for contrib project Zebra

The only reason for a branch is the fair number of improvements
we are planning for Zebra, combined with our desire to keep a stable Zebra
implementation that users can run with PIG on Hadoop-0.20.

New features planned (JIRAs will be filed soon):
    * Column security (different permissions for different columns)
    * Ability to drop columns
    * Ability to address "column groups" by name
    * Support for sorted tables and map-side joins
    * ...

Many of these changes involve changes to table metadata, schema syntax,
and the on-disk format of the metadata (all of these will be backward
compatible).

If Zebra were a project of its own, we would have made a 0.1.0 branch
and worked on new features in trunk. The proposed branch achieves the
same thing while keeping PIG and a stable Zebra together. The PIG 0.4.0
branch will be made when it is appropriate for PIG; in general, a contrib
project should not influence that decision.

Is there an alternative to creating a branch? Would you prefer we commit
new features to a line that is being used by users?


Milind A Bhandarkar wrote:
> IANAC, but my (non-binding) vote is also -1. I think all the improvements
> and feature additions to zebra should be available through pig trunk. The
> codebase is not big enough to justify creating a branch. If the reason is
> Pig's dependence on a checked in hadoop jar, the shims proposal by Dmitry
> should be taken up asap, so that those who want to use zebra can use pig
> trunk with hadoop 0.20
> - milind
> On 8/17/09 5:14 PM, "Yiping Han" <[EMAIL PROTECTED]> wrote:
>> +1
>> On 8/18/09 7:11 AM, "Olga Natkovich" <[EMAIL PROTECTED]> wrote:
>>> +1
>>> -----Original Message-----
>>> From: Raghu Angadi [mailto:[EMAIL PROTECTED]]
>>> Sent: Monday, August 17, 2009 4:06 PM
>>> Subject: Proposal to create a branch for contrib project Zebra
>>> Thanks to the PIG team, the first version of contrib project Zebra
>>> (PIG-833) is committed to PIG trunk.
>>> In short, Zebra is a table storage layer built for use in PIG and other
>>> Hadoop applications.
>>> While we are stabilizing current version V1 in the trunk, we plan to add
>>> more new features to it. We would like to create an svn branch for the
>>> new features. We will be responsible for managing zebra in PIG trunk and
>>> in the new branch. We will merge the branch when it is ready. We expect
>>> the changes to affect only 'contrib/zebra' directory.
>>> As a regular contributor to Hadoop, I will be the initial committer for
>>> Zebra. As more patches are contributed by other Zebra developers, there
>>> might be more committers added through the normal Hadoop/Apache procedure.
>>> I would like to create a branch called 'zebra-v2' with approval from PIG
>>> team.
>>> Thanks,
>>> Raghu.
>> ----------------------
>> Yiping Han
>> F-3140
>> (408)349-4403