To give this a little more detail: Pig currently fails an entire 1 PB
MapReduce job if a single record is malformed. For most use cases, that
is insane behavior. The ON ERROR proposal lets you handle errors in a
reasonable manner: specify thresholds at which the job should fail, and
split errant records off into another relation to study later.
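
To make the idea concrete, here is a rough sketch in Python. This is not
Pig syntax and not the proposal's actual design; the function name, the
exception class, and the max_error_fraction parameter are all made up
for illustration. It only shows the two behaviors described above:
errant records are diverted to a second collection, and the run fails
only when the error rate crosses a threshold.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class ErrorThresholdExceeded(RuntimeError):
    """Raised when the fraction of failed records exceeds the threshold."""


def process_with_error_split(
    records: Iterable[T],
    transform: Callable[[T], R],
    max_error_fraction: float = 0.01,  # hypothetical threshold knob
) -> Tuple[List[R], List[Tuple[T, str]]]:
    """Apply transform to each record, diverting failures instead of dying.

    Returns (good, bad): good holds the transformed records, bad holds
    (original_record, error_description) pairs for later study.
    """
    good: List[R] = []
    bad: List[Tuple[T, str]] = []
    total = 0
    for record in records:
        total += 1
        try:
            good.append(transform(record))
        except Exception as exc:
            # One malformed record no longer kills the whole run.
            bad.append((record, repr(exc)))
    # The threshold is checked once, after the full pass over the input.
    if total and len(bad) / total > max_error_fraction:
        raise ErrorThresholdExceeded(
            f"{len(bad)} of {total} records failed "
            f"(limit: {max_error_fraction:.2%})"
        )
    return good, bad


if __name__ == "__main__":
    lines = ["1", "2", "oops", "4"]
    good, bad = process_with_error_split(lines, int, max_error_fraction=0.5)
    print(good)                 # the parsed integers
    print([r for r, _ in bad])  # the records that failed to parse
```

With a threshold of 0.5, the one bad line out of four is set aside and
the rest are processed; with a threshold of 0.1, the same input raises
ErrorThresholdExceeded.
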
On Friday, December 20, 2013, Russell Jurney wrote:
> On Friday, December 20, 2013, Ruslan Al-Fakikh wrote:
>> Hi Russell,
>> Could you be more specific? What would this operator do?
>> Does it have something to do with control logic (like IF/ELSE, WHILE)?
>> AFAIK, those are not present in Pig because it would make Pig less clean.
>> On Sat, Dec 21, 2013 at 12:31 AM, Russell Jurney
>> <[EMAIL PROTECTED]> wrote:
>> > Does anyone think ON ERROR will ever get built into Pig? Would be so
>> > put pig above all other data flow tools in sophistication for large ETL.
>> > I would work on that, if someone would pay me to do it.
>> > --
>> > Russell Jurney twitter.com/rjurney [EMAIL PROTECTED]
>> > datasyndrome.com
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com