Pig >> mail # user >> RE: Intermittent NullPointerException


Malcolm Tye 2012-11-21, 11:58
RE: Intermittent NullPointerException
Hi Cheolsoo,
Even with the recompiled Pig, we still see the error. Here's a
debug log from Pig. It doesn't seem to give any more information.

Any ideas?
Thanks

Malc
-----Original Message-----
From: Malcolm Tye [mailto:[EMAIL PROTECTED]]
Sent: 13 November 2012 12:58
To: '[EMAIL PROTECTED]'
Subject: RE: Intermittent NullPointerException

Hi Cheolsoo,
I tried setting default_parallel to 1 to rule out parallel
processing, but the problem still happened.

I've recompiled Pig and have put that into the test environment with the
debug option set.

I don't have reproduction steps that fail every time. When the problem occurs,
we can run the same script again on the input file and the file gets
processed OK the next time!

Thanks

Malc
-----Original Message-----
From: Cheolsoo Park [mailto:[EMAIL PROTECTED]]
Sent: 12 November 2012 23:00
To: [EMAIL PROTECTED]
Subject: Re: Intermittent NullPointerException

Hi Malcolm,

If you're not running in parallel, it may be a different issue. But I am
surprised that Pig 0.10 local mode fails intermittently like you describe
w/o parallelism. You might have discovered a real issue. If you could
provide steps that reproduce the error, that would be great!

>> How do I tell which pig jar file I'm using currently ?

"pig -secretDebugCmd" will show which Pig jar file on the file system is
picked up. For example, it shows the following output for me:

/usr/bin/hadoop jar /home/cheolsoo/pig-svn/bin/../pig-withouthadoop.jar
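
If you want to check this from a script rather than by eye, a minimal sketch could look like the following. It assumes the output contains a line of the form shown above ("... jar <path>.jar"); the function name is made up for illustration, and the exact output of -secretDebugCmd may differ between Pig versions.

```python
def pig_jar_from_debug_output(output):
    """Return the first path that follows a 'jar' token, or None.

    Assumes a line shaped like: /usr/bin/hadoop jar /path/to/pig.jar
    (hypothetical helper, not part of Pig).
    """
    for line in output.splitlines():
        tokens = line.split()
        # Look at every token except the last, so tokens[i + 1] always exists.
        for i, tok in enumerate(tokens[:-1]):
            if tok == "jar" and tokens[i + 1].endswith(".jar"):
                return tokens[i + 1]
    return None


sample = "/usr/bin/hadoop jar /home/cheolsoo/pig-svn/bin/../pig-withouthadoop.jar"
print(pig_jar_from_debug_output(sample))
```

The function only parses text, so you can feed it the captured output of the command however you normally run it.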

Thanks,
Cheolsoo

On Mon, Nov 12, 2012 at 2:46 PM, Malcolm Tye
<[EMAIL PROTECTED]> wrote:

> Hi Cheolsoo,
>                 I'm not specifically setting default_parallel in my
> script anywhere and I see this in the log file :-
>
>
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Neither PARALLEL nor default parallelism is set for this job. Setting
> number of reducers to 1
>
> So I guess I'm not using parallel. Is it worth trying to compile Pig
> to use the Hadoop 0.23.x LocalJobRunner ? How do I tell which pig jar
> file I'm using currently ?
>
> Thanks
>
> Malc
>
>
> -----Original Message-----
> From: Cheolsoo Park [mailto:[EMAIL PROTECTED]]
> Sent: 12 November 2012 16:29
> To: [EMAIL PROTECTED]
> Subject: Re: Intermittent NullPointerException
>
> Hi Malcolm,
>
> How do you run your script? Do you run your script in parallel? Hadoop
> 1.0.x LocalJobRunner is not thread-safe, and Pig is by default built
> with Hadoop 1.0.x. I have seen a similar problem before (
> https://issues.apache.org/jira/browse/PIG-2852).
>
> If you're running your script in parallel, one workaround is to use the
> Hadoop 0.23.x LocalJobRunner, which is thread-safe. You can do the
> following:
> - If you're using the standalone pig.jar, please download the Pig
> source tarball and run "ant clean jar -Dhadoopversion=23" to build
> pig.jar.
> - If you're using installed Hadoop with pig-withouthadoop.jar, please
> install Hadoop 0.23.x, download the Pig source tarball, and run "ant
> clean jar-withouthadoop -Dhadoopversion=23" to build
> pig-withouthadoop.jar.
>
> Hope this is helpful.
>
> Thanks,
> Cheolsoo
>
> On Mon, Nov 12, 2012 at 7:14 AM, Malcolm Tye
> <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> >     I'm running Pig 0.10.0 in local mode on some small text files.
> > There is no intention to run it on Hadoop at all. We have a job that
> > runs every 5 minutes and about 3% of the time, the job fails with
> > the error below. It happens at random places within the Pig
> > script.
> >
> >
> > 2012-10-19 14:15:37,719 [Thread-15] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local_0004
> > java.lang.NullPointerException
> >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:286)
> >         at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:158)
org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)