Pig, mail # user - ON ERROR


Re: ON ERROR
Russell Jurney 2013-12-21, 05:55
So, to give this a little more detail: Pig currently will fail a 1 PB
MapReduce job if one record is malformed. In most use cases, that is insane
behavior. The ON ERROR proposal lets you handle errors in a reasonable
manner: specify thresholds to fail at, and split errant records off into
another relation to study later.
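
To make the idea concrete, here is a rough sketch in plain Python rather than Pig Latin. It is not the ON ERROR syntax from the proposal and not Pig's API; the names process_with_error_split, max_error_fraction and ErrorThresholdExceeded are made up for illustration. It only shows the two behaviors described above: bad records get split off into their own collection, and the job fails only once errors pass a threshold.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class ErrorThresholdExceeded(RuntimeError):
    """Raised when the share of failed records exceeds the allowed fraction."""


def process_with_error_split(
    records: Iterable[T],
    parse: Callable[[T], R],
    max_error_fraction: float = 0.01,
) -> Tuple[List[R], List[Tuple[T, str]]]:
    """Apply `parse` to every record, splitting failures into a second list.

    Hypothetical helper: returns (good, bad), where `bad` holds the original
    record plus a description of the error so it can be studied later.
    """
    good: List[R] = []
    bad: List[Tuple[T, str]] = []
    total = 0

    for record in records:
        total += 1
        try:
            good.append(parse(record))
        except Exception as exc:  # one malformed record must not kill the run
            bad.append((record, repr(exc)))

    # Fail only after the whole input is seen and the error rate is too high.
    if total and len(bad) / total > max_error_fraction:
        raise ErrorThresholdExceeded(
            f"{len(bad)} of {total} records failed "
            f"(allowed fraction: {max_error_fraction})"
        )
    return good, bad
```

For example, process_with_error_split(["1", "2", "oops", "4"], int, max_error_fraction=0.5) returns the three parsed integers plus a one-element list holding "oops" and its error, while the same input with a threshold of 0.1 raises ErrorThresholdExceeded. A real implementation would presumably check the threshold per task and as it goes rather than at the end; this version checks once to keep the sketch short.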

On Friday, December 20, 2013, Russell Jurney wrote:

> http://wiki.apache.org/pig/PigErrorHandlingInScripts
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/PIG-2620
>
> On Friday, December 20, 2013, Ruslan Al-Fakikh wrote:
>
>> Hi Russell,
>>
>> Could you be more specific? What would this operator do?
>> Does it have something to do with control logic (like IF/ELSE, WHILE,
>> etc.)?
>> AFAIK, those are not present in Pig because it would make Pig less clean.
>>
>> Thanks
>>
>>
>> On Sat, Dec 21, 2013 at 12:31 AM, Russell Jurney
>> <[EMAIL PROTECTED]> wrote:
>>
>> > Does anyone think ON ERROR will ever get built into Pig? It would be so
>> > cool, and would put Pig above all other data flow tools in
>> > sophistication for large ETL.
>> >
>> > I would work on that, if someone would pay me to do it.
>> >
>> >
>> > --
>> > Russell Jurney twitter.com/rjurney [EMAIL PROTECTED]
>> > datasyndrome.com
>> >
>>
>
>
> --
> Russell Jurney twitter.com/rjurney [EMAIL PROTECTED]
>  datasyndrome.com
>
--
Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com