Pig, mail # user - RE: Intermittent NullPointerException


Re: Intermittent NullPointerException
Cheolsoo Park 2012-11-20, 05:19
Hi Malcolm,

Thank you for sharing it. I am glad to hear that it worked. :-)

>> We're only processing ~200 rows at the most when we run the script, not
>> sure if that helps you narrow down the cause.

Very interesting. That's surprisingly small. In my test, I used 10M rows of
random integers as input. I wonder whether something in your data triggers
a race condition. Hard to tell. What's interesting, though, is that
FindBugs flags the static field in question as a potential bug, so I
filed PIG-3050 to fix it.
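
To make the static-field concern concrete, here is a minimal Java sketch of the check-then-act pattern discussed in the quoted message below, next to a thread-local variant. It is not the actual patch and not Pig code: the class name ReporterSketch, the Progressable interface (a stand-in for PigProgressable), and all method names are assumptions made up for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ReporterSketch {
    /** Hypothetical stand-in for PigProgressable; only progress() is modeled. */
    interface Progressable {
        void progress();
    }

    // Pattern from the thread: one static field shared by every thread.
    static Progressable sharedReporter;

    // Sketch of the suggested direction: each thread gets its own slot.
    static final ThreadLocal<Progressable> localReporter = new ThreadLocal<>();

    // Check-then-act on the shared field: another thread may set it to null
    // between the null check and the call, which would throw a
    // NullPointerException. Shown for comparison only; demo() does not call it.
    static void reportUnsafe() {
        if (sharedReporter != null) {
            sharedReporter.progress();
        }
    }

    // Thread-local variant: read once into a local variable, then check and
    // call. No other thread can change this thread's slot.
    static void reportThreadLocal() {
        Progressable r = localReporter.get();
        if (r != null) {
            r.progress();
        }
    }

    /**
     * Runs two threads. Thread a installs a counting reporter and reports
     * 1000 times; thread b keeps clearing its own slot. Returns the number
     * of progress() calls counted by thread a's reporter.
     */
    static int demo() {
        AtomicInteger calls = new AtomicInteger();
        Thread a = new Thread(() -> {
            localReporter.set(calls::incrementAndGet);
            for (int i = 0; i < 1000; i++) {
                reportThreadLocal();
            }
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                localReporter.set(null); // clears only thread b's slot
                reportThreadLocal();
            }
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return calls.get();
    }
}
```

In this sketch, ReporterSketch.demo() returns 1000 because thread b can only clear its own slot, so thread a's reporter is never disturbed. With the shared static field, the same interleaving could null the field between the check and the call.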

>> I assume we just use the patch you gave me on 0.10.0 until the fix comes
>> out in a later release?

Yes. It's a bit too late to get the fix in 0.11 now, but I will aim to fix
it in 0.12.

Regards,
Cheolsoo

P.S. I did more testing of my patch on my own and found some regressions
in streaming. If you're not using streaming, you should be fine, but I
wanted to let you know.
On Mon, Nov 19, 2012 at 12:30 PM, Malcolm Tye <[EMAIL PROTECTED]> wrote:

> Hi Cheolsoo,
>
> The patch works as expected. We've not seen one error in the test system
> since we installed the new jar file.
>
> We're only processing ~200 rows at the most when we run the script, not
> sure if that helps you narrow down the cause.
>
> I assume we just use the patch you gave me on 0.10.0 until the fix comes
> out in a later release?
>
> Many thanks for your quick response, it's very much appreciated.
>
>
> Malc
>
> -----Original Message-----
> From: Cheolsoo Park [mailto:[EMAIL PROTECTED]]
> Sent: 15 November 2012 00:16
> To: [EMAIL PROTECTED]
> Subject: Re: Intermittent NullPointerException
>
> Hi Malcolm,
>
> I have been running your script with 10M rows for half a day but couldn't
> reproduce your error, so my analysis may be baseless here.
>
> That being said, it looks like a race condition to me. The call stack in
> the log shows the following:
>
> Caused by: java.lang.NullPointerException
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:286)
>
> Now if you look at PhysicalOperator.java:286, it's like this:
>
> if(reporter!=null) {
>     reporter.progress(); // ---> NullPointerException is thrown here
> }
>
> So 'reporter' became null between 'if(reporter!=null)' and
> 'reporter.progress()'.
>
> Given that 'reporter' is a static field, this is totally possible.
>
> public static PigProgressable reporter;
>
> Even though you're setting default_parallel to 1, it only controls the
> number of reducers; the number of mappers is determined by the size of the
> input data. So you will still run multiple mapper threads in parallel in
> LocalJobRunner, and they might be stepping on each other.
>
> One possible fix is to change reporter to a thread-local variable.
> I will send a patch that does this to your email address. I based it on
> branch-0.10, so you should be able to apply it cleanly to the 0.10 source
> tarball by running:
>
> patch -p0 -i <patch file>
>
> Can you please try applying the patch, rebuilding Pig, and seeing whether
> that fixes your problem? If it does, I will try to write a unit test case
> and commit the fix upstream as well.
>
> Thanks,
> Cheolsoo
>
> On Wed, Nov 14, 2012 at 4:32 AM, Malcolm Tye <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > Looks like zip files get rejected. Here's the log file, unzipped.
> >
> >
> > Malc
> >
> >
> > -----Original Message-----
> > From: Malcolm Tye [mailto:[EMAIL PROTECTED]]
> > Sent: 14 November 2012 12:01
> > To: '[EMAIL PROTECTED]'
> > Subject: RE: Intermittent NullPointerException
> >
> > Hi Cheolsoo,
> >
> > Even with the recompiled Pig, we still see the error. Here's a debug
> > log from Pig. It doesn't seem to give any more information.
> >
> > Any ideas ?
> >
> >
> > Thanks
> >
> > Malc
> >
> >
> > -----Original Message-----
> > From: Malcolm Tye [mailto:[EMAIL PROTECTED]]
> > Sent: 13 November 2012 12:58
> > To: '[EMAIL PROTECTED]'