Pig >> mail # user >> long parse time


Re: long parse time
What version of Pig are you using? Unreasonably long parse times were an issue in Pig 0.9 and 0.10; I believe those issues were fixed in Pig 0.11.

Alan.

On Mar 28, 2013, at 12:51 PM, Patrick Salami wrote:

> We have some very long pig scripts that run several times per day. We
> believe that the script parsing process takes very long (about 1h). During
> this time, the pig command just hangs before any output is displayed (I am
> assuming this is the parsing phase). My question is, can this process be
> optimized by somehow serializing the intermediate parsed script to disk
> after the parsing phase is complete so that we don't have to go through the
> parsing process each time the script is run (so long as the script itself
> does not change)? That way, we could then load and run the parsed
> representation of the script rather than re-parsing it for each run. Since
> this is probably not a readily-available feature, could someone please
> point me to the right place in the code where this intermediate output can
> be intercepted?
>
> Thanks!
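
The caching idea in the question (reuse the parsed form as long as the script text is unchanged) can be sketched generically as a cache keyed on a hash of the script. This is only an illustration of the idea, not a Pig feature or API: `load_or_parse`, `script_digest`, and the `parse` callback are hypothetical names, and the sketch assumes the parsed representation can be pickled, which the thread does not establish for Pig's internal plans.

```python
import hashlib
import pickle
from pathlib import Path


def script_digest(script_text: str) -> str:
    """Return a hex SHA-256 digest of the script text, used as the cache key."""
    return hashlib.sha256(script_text.encode("utf-8")).hexdigest()


def load_or_parse(script_text, parse, cache_dir):
    """Return the parsed form of script_text, reusing a cached copy if one exists.

    `parse` is a hypothetical stand-in for the expensive parsing step; it is
    called only when no cache entry matches the script's digest.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / (script_digest(script_text) + ".pkl")

    if cache_file.exists():
        # Same script text as a previous run: skip parsing entirely.
        with cache_file.open("rb") as fh:
            return pickle.load(fh)

    # New or modified script: parse once, then store the result for later runs.
    parsed = parse(script_text)
    with cache_file.open("wb") as fh:
        pickle.dump(parsed, fh)
    return parsed
```

Because the key is derived from the script contents, any edit to the script produces a different digest and forces a fresh parse, which matches the "so long as the script itself does not change" condition above.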