Pig, mail # user - [DISCUSSION]Pig releases with different versions of Hadoop


Re: [DISCUSSION]Pig releases with different versions of Hadoop
Alejandro Abdelnur 2011-11-07, 19:40
Hi Olga,

Regarding #1, does this mean we'd have a build of Pig X for each
version of Hadoop we support? It seems to me this would be a bit
complex to maintain.

Regarding #2, if Hadoop does a good job at maintaining public API
backwards compatibility and Pig uses only Hadoop public API we would
be good.

Regarding #3, I can still see potential issues (from my experience
with Hadoop-Oozie) where the API did not change but the behavior did.
This means we'll have to be able to if/then/else within Pig whenever
necessary based on the version of Hadoop.

A possible way of addressing this would be:

* Pig should use the 'hadoop' script to run Pig (this would help to cleanly
bring the Hadoop dependencies into the classpath).
* Pig could have a whitelist of Hadoop versions it supports and fail if
the current Hadoop version is not supported (we could use version
regex/ranges)
* (what I'm suggesting in #3 above) Pig could use the Hadoop version
as a code selector whenever necessary.
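
To make the last two points concrete, here is a rough sketch of a
version whitelist plus a version-based code selector. It is only an
illustration: the class name, the enum, and the regex patterns are
hypothetical, and the patterns are not Pig's actual support matrix.
The version string is passed in as a parameter; in a real build it
could come from something like Hadoop's
org.apache.hadoop.util.VersionInfo.getVersion().

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Pig: whitelist check plus code selector.
final class HadoopVersionGate {

    // Illustrative whitelist only (assumption): one regex per supported line.
    private static final List<Pattern> SUPPORTED = Arrays.asList(
            Pattern.compile("0\\.20\\.\\d+.*"),
            Pattern.compile("0\\.23\\.\\d+.*"));

    // Hypothetical names for the version-specific code paths.
    enum Shim { HADOOP_20, HADOOP_23 }

    // Returns true if the whole version string matches a whitelisted pattern.
    static boolean isSupported(String version) {
        for (Pattern p : SUPPORTED) {
            if (p.matcher(version).matches()) {
                return true;
            }
        }
        return false;
    }

    // Fails fast on unsupported versions, otherwise picks the code path.
    static Shim select(String version) {
        if (!isSupported(version)) {
            throw new IllegalStateException(
                    "Unsupported Hadoop version: " + version);
        }
        return version.startsWith("0.23.") ? Shim.HADOOP_23 : Shim.HADOOP_20;
    }
}
```

Pig would call select() once at startup and branch on the result
wherever the behavior differs between Hadoop versions, which keeps
the if/then/else logic tied to a single, explicit decision point.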

Thanks.

Alejandro

On Mon, Nov 7, 2011 at 11:15 AM, Olga Natkovich <[EMAIL PROTECTED]> wrote:
> Hi,
>
> In the past we have for the most part avoided supporting multiple versions of Hadoop with the same version of Pig. This is about to change with release of Hadoop 23. We need to come up with a strategy on how to support that. There are a couple of issues to consider:
>
>
> (1)    Version numbering. Seems like encoding the information in the last version number makes sense. The details of the encoding need to be hashed out
>
> (2)    Code changes required to support different versions of Hadoop. This time around we made an effort to make sure that the same code can work with both. In the future that might not work and we would need to figure out how to maintain different code bases. Most likely we would have to have additional branches off of the main release branch
>
> (3)    Anything else we need to consider?
>
> Olga
>