Pig, mail # dev - [Discussion] Keep or revert PIG-3419 in trunk?


[Discussion] Keep or revert PIG-3419 in trunk?
Cheolsoo Park 2013-10-07, 19:05
Hi devs,

This is a follow-up discussion about how to resolve the backward
incompatibility of PIG-3419 (Pluggable execution engine). Per the previous
discussion <http://search-hadoop.com/m/wYz6hz9CoE>, I reverted it in 0.12
but kept it in trunk. As we keep committing more changes into trunk, it
gets harder to back out PIG-3419 cleanly. So I suggest we make a
decision sooner rather than later.

The crux of the problem is as follows:

PIG-3419 removes all the MR-specific things from JobStats. However,
PigRunner and PigServer return PigStats, which in turn exposes JobStats to
end users. So changing JobStats breaks backward compatibility for
downstream projects such as Oozie. While changing the semantics of JobStats
is acceptable, we must provide a deprecation path so that end users can
upgrade their applications smoothly.

Proposed solutions:

1) Provide backward compatibility in source code: This may be possible, but
no one has come up with a clean solution. For example, I tried and failed. :(

2) Publish two jars: We keep PIG-3419 only in tez-branch and publish two
jars (one for old API and one for new API) for a couple of future releases.
If we do this, we will revert PIG-3419 in trunk.

What do you think?

Thanks,
Cheolsoo