Pig, mail # user - What should storefuncs do on parse errors while reading?


Re: What should storefuncs do on parse errors while reading?
Bill Graham 2012-03-24, 17:33
The pattern I use with bad data is to increment a counter and return null.
Logging an error message is also good, but that could turn into a massive
log file if a large share of the dataset is bad. I'd be curious to hear
others' thoughts on the logging bit.

Either way, I think this is a good change to make to AvroStorage.
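
For what it's worth, here is a rough sketch of the count-and-return-null
pattern with the logging capped so the log can't blow up. It is plain Java
and not AvroStorage code: BadRecordParser, parse(), badRecordCount() and
maxLoggedErrors are made-up names for illustration, the integer parsing
stands in for whatever decoding the real loader does, and the AtomicLong
stands in for a Hadoop counter.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

/** Hypothetical parser: counts bad records and logs only the first few. */
class BadRecordParser {
    private static final Logger LOG =
            Logger.getLogger(BadRecordParser.class.getName());

    // Stand-in for a Hadoop counter; a real job would report through the framework.
    private final AtomicLong badRecords = new AtomicLong();
    private final long maxLoggedErrors;

    BadRecordParser(long maxLoggedErrors) {
        this.maxLoggedErrors = maxLoggedErrors;
    }

    /** Returns the parsed value, or null if the record is malformed. */
    Integer parse(String raw) {
        try {
            return Integer.valueOf(raw.trim());
        } catch (RuntimeException e) {
            // Assumption: a real loader would catch its decoder's own exception type.
            long seen = badRecords.incrementAndGet();
            if (seen <= maxLoggedErrors) {
                LOG.warning("Skipping bad record #" + seen + ": " + e);
            } else if (seen == maxLoggedErrors + 1) {
                LOG.warning("Further bad-record messages suppressed; check the counter.");
            }
            return null;
        }
    }

    long badRecordCount() {
        return badRecords.get();
    }
}
```

With a cap like this, the counter still gives the full tally of bad records
at the end of the job, while the log only carries the first few examples.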

On Fri, Mar 23, 2012 at 7:03 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:

> One record in a 125MB avro file is killing my script.  I could patch
> AvroStorage() to catch the exception and return null after logging an error
> - I think.  Should I?
>
> --
> Russell Jurney twitter.com/rjurney [EMAIL PROTECTED]
> datasyndrome.com
>

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
[EMAIL PROTECTED] going forward.*