Pig >> mail # user >> RE: Intermittent NullPointerException


RE: Intermittent NullPointerException
Hi Cheolsoo,

When we encounter the problem, we can reprocess the file
with no problems in a later run. If you want a sample file, I can pick one
up for you.

OK, we'll use your patch on top of 0.10.0 until we see the fix included in
a later release.

We're not using streaming.

Many thanks

Malc

-----Original Message-----
From: Cheolsoo Park [mailto:[EMAIL PROTECTED]]
Sent: 20 November 2012 05:19
To: [EMAIL PROTECTED]
Subject: Re: Intermittent NullPointerException

Hi Malcolm,

Thank you for sharing it. I am glad to hear that it worked. :-)

>> We're only processing ~200 rows at the most when we run the script,
>> not sure if that helps you narrow down the cause.

Very interesting. That's surprisingly small. In my test, I used 10M rows of
random integers as input. I am wondering whether it's your data that
triggers a race condition. Hard to tell. But what's interesting is that
FindBugs flags the static field in question as a potential bug, so I
filed PIG-3050 to fix it.

>> I assume we just use the patch you gave me on 0.10.0 until the fix
>> comes out in a later release ?

Yes. It's a bit too late to get the fix in 0.11 now, but I will aim to fix
it in 0.12.

Regards,
Cheolsoo

p.s. I did more testing with my patch by myself and found some regressions
in streaming. If you're not using streaming, you should be fine, but I am
just letting you know.

On Mon, Nov 19, 2012 at 12:30 PM, Malcolm Tye
<[EMAIL PROTECTED]> wrote:

> Hi Cheolsoo,
>
> The patch works as expected. We've not seen one error
> in the test system since we installed the new jar file.
>
> We're only processing ~200 rows at the most when we run the script,
> not sure if that helps you narrow down the cause.
>
> I assume we just use the patch you gave me on 0.10.0 until the fix
> comes out in a later release ?
>
> Many thanks for your quick response, it's very much appreciated.
>
>
> Malc
>
> -----Original Message-----
> From: Cheolsoo Park [mailto:[EMAIL PROTECTED]]
> Sent: 15 November 2012 00:16
> To: [EMAIL PROTECTED]
> Subject: Re: Intermittent NullPointerException
>
> Hi Malcolm,
>
> I have been running your script with 10M rows for half a day but
> couldn't reproduce your error. So my analysis may be baseless here.
>
> That being said, it looks like a race condition to me. The call stack
> in the log shows the following:
>
> Caused by: java.lang.NullPointerException
>         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:286)
>
> Now if you look at PhysicalOperator.java:286, it looks like this:
>
> if (reporter != null) {
>     reporter.progress(); // ---> NullPointerException is thrown here
> }
>
> So 'reporter' became null between 'if(reporter!=null)' and
> 'reporter.progress()'.
>
> Given that 'reporter' is a static field, this is totally possible.
>
> public static PigProgressable reporter;
>
> Even though you're setting default_parallel to 1, it only controls the
> number of reducers, and the number of mappers is determined by the
> size of input data. So you will still run multiple mapper threads in
> parallel in LocalJobRunner, and they might be stepping on each other.
>
> One possible fix is changing reporter to a thread-local variable.
> I will send a patch that does this to your email address. I based it
> on branch-0.10, so you should be able to apply it cleanly to the 0.10
> source tarball by running:
>
> patch -p0 -i <patch file>
>
> Can you please try to apply the patch, rebuild Pig, and see if that
> fixes your problem? If it does, I will try to write a unit test case
> and commit the fix upstream as well.
>
> Thanks,
> Cheolsoo
>
> On Wed, Nov 14, 2012 at 4:32 AM, Malcolm Tye
> <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > Looks like zip files get rejected. Here's the log file, unzipped.
> >
> >
> > Malc
> >
> >
> > -----Original Message-----
> > From: Malcolm Tye [mailto:[EMAIL PROTECTED]]
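
Editor's note: the check-then-act race discussed in this thread, and the thread-local idea proposed as a fix, can be illustrated with a small standalone Java sketch. This is not the patch that was emailed, nor the PIG-3050 change; the class ReporterSketch, the interface Progressable (a stand-in for PigProgressable) and all method names are hypothetical and chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; none of these names come from the Pig code base.
class ReporterSketch {
    /** Hypothetical stand-in for PigProgressable; only progress() is modeled. */
    interface Progressable {
        void progress();
    }

    // Shared by all threads, like the static 'reporter' field quoted in the thread.
    static volatile Progressable sharedReporter;

    // One slot per thread: a thread only ever sees the value it set itself.
    static final ThreadLocal<Progressable> localReporter = new ThreadLocal<>();

    // Racy check-then-act: the field is read twice, so another thread can set it
    // to null after the check and before the call, causing a NullPointerException.
    static void reportUnsafe() {
        if (sharedReporter != null) {
            sharedReporter.progress();
        }
    }

    // Thread-local variant: read once into a local, and no other thread can
    // clear this thread's slot.
    static void reportThreadLocal() {
        Progressable r = localReporter.get();
        if (r != null) {
            r.progress();
        }
    }

    // Starts 'threads' workers that each set, use, and clear their own reporter
    // 'callsPerThread' times; returns the total number of progress() calls.
    static int runThreadLocal(int threads, int callsPerThread) throws InterruptedException {
        AtomicInteger count = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < callsPerThread; j++) {
                    localReporter.set(count::incrementAndGet);
                    reportThreadLocal();
                    localReporter.remove(); // clears this thread's slot only
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
        return count.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runThreadLocal(4, 1000));
    }
}
```

In this sketch every worker clears its reporter after each call, which would be enough to trip reportUnsafe() if the workers shared one static field. With a per-thread slot, one thread clearing its reporter cannot affect another, so every call is counted. Whether a thread-local is the right trade-off for Pig itself is a separate question; the p.s. above mentions regressions in streaming with the emailed patch.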