Pig >> mail # user >> Working with date converter

Re: Working with date converter
sorry, I didn't understand completely

Do you want to read a line and, if the date is invalid (calling
IsoToUnix directly, with no regex check before it), skip that line? Is
that it?
If yes, you can replace the field with your converted date (unix
format), and if the conversion fails, put a null or nothing.

I mean, in your overridden putNext you have your individual columns;
you can try to convert the date there and put your unix date in the
output.
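
A minimal sketch of that idea in plain Java, in case it helps. It is not
Pig's IsoToUnix and does not touch the Pig API: the class and method names
(SafeIsoToUnix, toUnixMillis, convertDateColumn) are made up for
illustration, and java.time stands in for whatever parser you actually use.
It assumes the date is the last column and carries an offset or a "Z".

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of Pig or piggybank.
final class SafeIsoToUnix {

    // Returns epoch milliseconds, or null when the string is not a valid
    // ISO-8601 date-time with an offset (for example a half-overwritten one).
    static Long toUnixMillis(String iso) {
        if (iso == null) {
            return null;
        }
        try {
            return OffsetDateTime.parse(iso.trim()).toInstant().toEpochMilli();
        } catch (DateTimeParseException e) {
            return null; // swallow the failure instead of killing the job
        }
    }

    // Assumption: the date is the last field of the row. The input array is
    // left untouched; the copy holds the unix date or null in that position.
    static String[] convertDateColumn(String[] fields) {
        String[] out = fields.clone();
        int last = out.length - 1;
        if (last >= 0) {
            Long millis = toUnixMillis(out[last]);
            out[last] = (millis == null) ? null : millis.toString();
        }
        return out;
    }

    public static void main(String[] args) {
        String[] good = convertDateColumn(new String[] {"a", "b", "1970-01-01T00:00:00Z"});
        String[] bad = convertDateColumn(new String[] {"a", "b", "2011-11-08T10:15:3GET /x"});
        System.out.println(good[2]); // prints 0
        System.out.println(bad[2]);  // prints null
    }
}
```

Inside a putNext you would do the same on the tuple's date field and write
the null (or skip the row) when the helper returns null.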

Sorry if I misunderstood your problem again.

On 11/8/11, Rauan Maemirov <[EMAIL PROTECTED]> wrote:
> Sure, but now I'm just omitting the rows _after_ regex matching.
> What I want to do is to avoid additional filtering by regex and ignore
> invalid rows right after unsuccessful IsoToUnix().
> 2011/11/8 pablomar <[EMAIL PROTECTED]>
>> can you write something else (a null, for example) in your putNext
>> method for that field when the date is invalid ?
>> On 11/8/11, Rauan Maemirov <[EMAIL PROTECTED]> wrote:
>> > Well, I solved this issue via regex matching, but I wonder if it's too
>> > costly.
>> > Is there any way to ignore exceptions and move on, just omitting
>> > the wrong tuples?
>> >
>> > 2011/11/8 Rauan Maemirov <[EMAIL PROTECTED]>
>> >
>> >> Hi, all. I've got a custom log (comma-delimited CSV) with ISO dates.
>> >> Sometimes log writing lags and I get exceptions for a wrong ISO date
>> >> format.
>> >> Here's the exception: https://gist.github.com/1347406. (The date is the
>> >> last "parameter" in the row, and it's incorrectly overwritten at the
>> >> end by another string.)
>> >>
>> >> The question is: how can I filter out all wrong dates, or at least force
>> >> Pig to ignore them instead of failing?
>> >>
>> >