-
ONERROR (Russell Jurney, 2012-02-05, 03:11)
Did ONERROR ever get built? I have a few bad datetimes out of many failing to parse, and I don't want my entire pig script dying because I lost a few rows.

http://wiki.apache.org/pig/PigErrorHandlingInScripts

--
Russell Jurney
twitter.com/rjurney
[EMAIL PROTECTED]
datasyndrome.com
-
Re: ONERROR (Daniel Dai, 2012-02-06, 05:04)
No, there is no ONERROR handle right now.

Daniel
-
Re: ONERROR (Russell Jurney, 2012-02-06, 23:27)
I just had to copy CustomFormatToISO and create ForgivingCustomFormatToISO that does a try/catch/return null, because 0.01% of my records have bad RFC1123 dates in them. This seems very, very wrong.

Is there a better way than this at the moment, or is this something that must be addressed with ONERROR?

Russ
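For readers who have not seen the pattern, the try/catch/return null approach described above might look roughly like the sketch below. It is not the code of CustomFormatToISO or ForgivingCustomFormatToISO: the class name ForgivingRfc1123ToIso and the method convert are made up for illustration, and it uses the JDK's java.time rather than Pig's EvalFunc API so that it runs on its own.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical stand-alone sketch; not the piggybank UDF itself.
class ForgivingRfc1123ToIso {

    // Converts an RFC 1123 date string to ISO 8601.
    // Returns null instead of throwing when the input is missing or malformed,
    // so one bad row does not abort the whole job.
    static String convert(String input) {
        if (input == null) {
            return null;
        }
        try {
            ZonedDateTime parsed =
                ZonedDateTime.parse(input.trim(), DateTimeFormatter.RFC_1123_DATE_TIME);
            return parsed.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME);
        } catch (DateTimeParseException e) {
            // Bad date: swallow the error and let the caller filter out the null.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("Sun, 05 Feb 2012 03:11:07 GMT"));
        System.out.println(convert("not a date"));
    }
}
```

In a real Pig UDF the same body would sit inside exec(), and the script would then filter out rows where the result is null.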
-
Re: ONERROR (Dmitriy Ryaboy, 2012-02-07, 01:22)
Try / catch / return null seems like exactly the right thing to do.
You will note a lot of string parsing UDFs in piggybank work that way.
-
Re: ONERROR (Russell Jurney, 2012-02-07, 02:01)
Is there a way to report the records we null through counters or something?
-
Re: ONERROR (Prashant Kommireddi, 2012-02-07, 02:06)
Russell, you could use PigWarning to report counters:
http://pig.apache.org/docs/r0.9.1/api/org/apache/pig/PigWarning.html

This should display as counters on the JobTracker. Please make sure you have "aggregate.warning" set to true (by default it is true, but just in case).

    try {
        // foo bar
    } catch (IndexOutOfBoundsException ie) {
        String msg = "Some message";
        warn(msg + " --> " + ie.toString(), PigWarning.UDF_WARNING_2);
        return null;
    } catch (NullPointerException npe) {
        warn(npe.toString(), PigWarning.UDF_WARNING_3);
        return null;
    } catch (ClassCastException cce) {
        warn(cce.toString(), PigWarning.UDF_WARNING_4);
        return null;
    }
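To make the counting idea concrete outside a cluster, here is a rough stand-alone sketch of the same shape: each catch branch bumps a counter and returns null. Everything in it is an assumption for illustration. The CountingParser class, its Warning enum, and its warn and count methods are hypothetical stand-ins for PigWarning and the warn(...) call shown above, and the in-memory map only imitates what the JobTracker counters would aggregate.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical stand-alone sketch; Warning and warn() only imitate Pig's API.
class CountingParser {

    // Stand-in for the PigWarning categories used in the thread.
    enum Warning { BAD_INDEX, NULL_INPUT, BAD_NUMBER }

    private final Map<Warning, Long> counters = new EnumMap<>(Warning.class);

    // Stand-in for warn(msg, PigWarning): records one occurrence per call.
    void warn(String msg, Warning kind) {
        counters.merge(kind, 1L, Long::sum);
    }

    long count(Warning kind) {
        return counters.getOrDefault(kind, 0L);
    }

    // Parses the integer in the given comma-separated field.
    // Returns null and counts the failure instead of throwing.
    Integer field(String line, int index) {
        try {
            return Integer.valueOf(line.split(",")[index].trim());
        } catch (ArrayIndexOutOfBoundsException e) {
            warn("missing field " + index, Warning.BAD_INDEX);
            return null;
        } catch (NullPointerException e) {
            warn("null record", Warning.NULL_INPUT);
            return null;
        } catch (NumberFormatException e) {
            warn(e.toString(), Warning.BAD_NUMBER);
            return null;
        }
    }

    public static void main(String[] args) {
        CountingParser parser = new CountingParser();
        parser.field("1,2", 1);
        parser.field("1,x", 1);
        parser.field("1,2", 5);
        for (Warning w : Warning.values()) {
            System.out.println(w + " = " + parser.count(w));
        }
    }
}
```

One counter per failure category keeps the totals readable after the job finishes, which is the same reason the snippet above uses a different UDF_WARNING_n constant in each catch branch.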
-
Re: ONERROR (Russell Jurney, 2012-02-07, 03:03)
Thanks, I'll add that to the patch:
https://issues.apache.org/jira/browse/PIG-2515