I typically increment a counter and have a bounded log of randomly sampled
On Mar 24, 2012 6:50 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> Can do a counter and log the first few thousand rows or something ...
> On Mar 24, 2012, at 10:33 AM, Bill Graham <[EMAIL PROTECTED]> wrote:
> > The pattern I use with bad data is to increment a counter and return
> > Logging an error message is also good, but that could turn into a
> > log file if there's a large dataset of bad data. Would be curious to hear
> > others' thoughts re the logging bit.
> > Either way, I think this is a good change to make to AvroStorage.
> > On Fri, Mar 23, 2012 at 7:03 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
> >> One record in a 125MB avro file is killing my script. I could patch
> >> AvroStorage() to catch the exception and return null after logging an
> >> - I think. Should I?
> >> --
> >> Russell Jurney twitter.com/rjurney [EMAIL PROTECTED]
> >> datasyndrome.com
> > --
> > *Note that I'm no longer using my Yahoo! email address. Please email me at
> > [EMAIL PROTECTED] going forward.*
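
The pattern discussed in this thread (increment a counter, keep a bounded log of bad records, return null instead of failing) could look roughly like the sketch below. It is written in plain Python for brevity and is not AvroStorage's actual code or API: `BadRecordTracker`, `read_or_none`, and the `decode` callable are hypothetical names introduced only for illustration. The bounded log is filled by reservoir sampling, which is one possible way to keep a random sample; nothing in the thread says which sampling method is used in practice.

```python
import random


class BadRecordTracker:
    """Counts bad records and keeps a bounded, uniformly random sample of them.

    Hypothetical helper for illustration; not part of Pig or AvroStorage.
    """

    def __init__(self, max_samples=100, seed=None):
        self.max_samples = max_samples
        self.count = 0        # total number of bad records seen
        self.samples = []     # at most max_samples (record, error) pairs
        self._rng = random.Random(seed)

    def record(self, raw, error):
        self.count += 1
        entry = (raw, repr(error))
        if len(self.samples) < self.max_samples:
            # Still under the cap: keep every bad record.
            self.samples.append(entry)
        else:
            # Reservoir sampling (Algorithm R): the new entry replaces a
            # random slot with probability max_samples / count.
            j = self._rng.randrange(self.count)
            if j < self.max_samples:
                self.samples[j] = entry


def read_or_none(decode, raw, tracker):
    """Apply decode(raw); on failure, track the bad record and return None."""
    try:
        return decode(raw)
    except Exception as exc:  # assumption: a real loader would catch a narrower type
        tracker.record(raw, exc)
        return None


# Usage: int() stands in for a record decoder that fails on bad input.
tracker = BadRecordTracker(max_samples=3, seed=0)
rows = ["1", "x", "2", "y", "z", "w", "3"]
decoded = (read_or_none(int, r, tracker) for r in rows)
good = [v for v in decoded if v is not None]
# good is [1, 2, 3]; tracker.count is 4; tracker.samples holds 3 entries
```

Memory stays bounded by `max_samples` no matter how much of the dataset is bad, which addresses the concern about the log growing with the amount of bad data. The simpler "log the first few thousand rows" variant is the same code without the `else` branch.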