Pig >> mail # user >> Slow tutorial?


RE: Slow tutorial?
Also, using bz2 gives an error; the runs were with the uncompressed
excite.log. With excite.log.bz2:

[amiry@gsgw1011 pigtmp]$ java -cp pig_latest.jar org.apache.pig.Main -x local script1-local.pig
2008-06-26 20:27:09,708 [main] ERROR org.apache.pig.tools.grunt.Grunt - java.io.IOException: Unable to store alias null
        at org.apache.pig.impl.util.WrappedIOException.wrap(WrappedIOException.java:16)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:296)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:457)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:233)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:63)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:60)
        at org.apache.pig.Main.main(Main.java:294)
[unreadable binary output omitted]
        at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:136)
        at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execute(LocalExecutionEngine.java:27)
        at org.apache.pig.PigServer.optimizeAndRunQuery(PigServer.java:413)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:293)
        ... 5 more
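
Since the runs above only went through with the uncompressed excite.log, one way to get that file from excite.log.bz2 is to decompress it before running the script. Below is a minimal sketch in Python using only the standard library; the function name decompress_bz2 is made up for illustration and is not part of Pig or the tutorial.

```python
import bz2
import shutil
import sys
from pathlib import Path


def decompress_bz2(src, dst=None, chunk_size=1024 * 1024):
    """Stream-decompress a .bz2 file (hypothetical helper, not part of Pig).

    If dst is not given, the ".bz2" suffix is dropped from src,
    so "excite.log.bz2" becomes "excite.log".
    """
    src = Path(src)
    if dst is None:
        if src.suffix == ".bz2":
            dst = src.with_suffix("")
        else:
            dst = src.with_name(src.name + ".out")
    dst = Path(dst)
    # Copy in chunks so a large log never has to fit in memory at once.
    with bz2.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    return dst


if __name__ == "__main__":
    # Assumed usage: python decompress.py excite.log.bz2
    print(decompress_bz2(sys.argv[1]))
```

This only sidesteps the bz2 error by changing the input file; it does not explain or fix the error itself.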

Amir

-----Original Message-----
From: Amir Youssefi [mailto:[EMAIL PROTECTED]]
Sent: Thursday, June 26, 2008 2:06 PM
To: [EMAIL PROTECTED]; Olga Natkovich
Subject: RE: Slow tutorial?
I built the latest pig.jar and tested the defaults in pig.properties
with PIG-235.

Local mode is still running after half an hour and may not finish for
hours.

On 3 nodes, Hadoop/mapreduce mode ran in less than 10 min (similar to
the old runs we had).

Amir
-----Original Message-----
From: Amir Youssefi [mailto:[EMAIL PROTECTED]]
Sent: Thursday, June 26, 2008 12:30 PM
To: [EMAIL PROTECTED]
Subject: RE: Slow tutorial?

Hi Mark,

The pig.jar that comes with the tutorial is old and doesn't have pig.properties.

 Try making a new build (June 26th or later) and make sure you have
these in pig.properties:

#Do not spill temp files smaller than this size (bytes)
pig.spill.size.threshold=5000000
#EXPERIMENT: Activate garbage collection when spilling a file bigger than this size (bytes)
#This should help reduce the number of files being spilled.
pig.spill.gc.activation.size=40000000

or similar numbers...
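
To make the two thresholds concrete, here is a rough sketch in Python of what the comments in that file describe. It follows only the wording of the comments; it is not Pig's actual spill code, and parse_properties and spill_decision are hypothetical names made up for this example.

```python
# The two settings quoted above, as they would appear in pig.properties.
PROPERTIES_TEXT = """\
#Do not spill temp files smaller than this size (bytes)
pig.spill.size.threshold=5000000
#EXPERIMENT: Activate garbage collection when spilling a file bigger than this size (bytes)
pig.spill.gc.activation.size=40000000
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def spill_decision(size_bytes, props):
    """Return (spill, run_gc) for a temp file of the given size.

    Assumption taken from the comments only: files below the threshold
    are not spilled, and garbage collection is activated when a spilled
    file is bigger than the activation size.
    """
    threshold = int(props["pig.spill.size.threshold"])
    gc_size = int(props["pig.spill.gc.activation.size"])
    spill = size_bytes >= threshold
    run_gc = spill and size_bytes > gc_size
    return spill, run_gc


if __name__ == "__main__":
    props = parse_properties(PROPERTIES_TEXT)
    for size in (1_000_000, 10_000_000, 50_000_000):
        print(size, spill_decision(size, props))
```

With the numbers above, a 1 MB file would stay in memory, a 10 MB file would be spilled, and a 50 MB file would be spilled and also trigger garbage collection.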

Amir

-----Original Message-----
From: Mark Snow [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, June 25, 2008 8:07 PM
To: [EMAIL PROTECTED]
Subject: Slow tutorial?

Hi All,

I downloaded the pig tutorial to give it a whirl, set it up on a hadoop
cluster I've used for a few other tasks (7 nodes, ec2) and went through
the instructions to launch tutorial script1 with the excite bz file on
hdfs. Two things jumped out:

1) Only one mapper launched
2) It's really slow. It's been almost 5 hours and the mapper is still
under 10% complete

Have I misconfigured something? What's a good benchmark run time for the
tutorial scripts to complete?

      