Hi, I'm an engineer at Mortar Data. I was working on some features to
improve macros that I'd like to contribute (we're hoping to build a library
of reusable pig macros implementing common algorithms), but I wanted to
check in here first to see if anyone has concerns about the changes.
The changes I've implemented are:
1. Macro files can register jars and UDFs (avoiding namespace conflicts
is the user's responsibility)
2. Macro files can be redundantly imported (the extra import
statements will be ignored). The use case is a pigscript that imports macro
files A and B, where A also imports B. Pig will emit a warning rather than
failing as it currently does.
3. Registers and imports from S3 aren't repeatedly downloaded as a
pigscript is parsed. I'm not sure why it was doing this in the first place,
but it looked like the query was being assembled line by line, and each
time a line was added the jars etc. were downloaded again.
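
To make changes 1 and 2 concrete, here is a rough sketch of the kind of
layout they are meant to allow. The file names, the jar, and the macro and
field names are all made up for illustration, and the behavior described in
the comments is the proposed behavior, not what stock Pig does today.

```
-- b.pig (hypothetical macro file)
REGISTER 'my-udfs.jar';   -- change 1: a register inside a macro file

DEFINE top_n(rel, field, n) RETURNS out {
    sorted = ORDER $rel BY $field DESC;
    $out = LIMIT sorted $n;
};

-- a.pig (hypothetical macro file that builds on b.pig)
IMPORT 'b.pig';

DEFINE top_by_score(rel) RETURNS out {
    $out = top_n($rel, score, 10);
};

-- script.pig (the pigscript)
IMPORT 'a.pig';
IMPORT 'b.pig';   -- change 2: redundant, since a.pig already imports b.pig;
                  -- this would produce a warning instead of an error

data = LOAD 'input' AS (user:chararray, score:int);
best = top_by_score(data);
DUMP best;
```

Here the pigscript can import b.pig directly without needing to know that
a.pig already pulls it in.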
I was working on our fork of 0.9.2 with modifications, so please let me
know if any of these have already been fixed in the latest version.