Pig >> mail # dev >> Persisting Pig Scripts


Persisting Pig Scripts
Hi All,

What do you think about adding a feature to persist each script (the file,
or the cached statements in the case of Grunt) to HDFS or the local
filesystem, controlled by an admin setting in pig.properties? This would
help infrastructure/ops teams analyze the nature of Pig scripts and make
decisions based on that analysis (for example, optimizing data storage based
on access patterns). We actually want to do this ourselves, but the
challenge is that there is no central place where we can track user scripts.

It could be a config param such as "pig.persist.script=/pig/". The script
could be stored under a configurable name, for example
"${mapred.job.name}+${user.name}+timestamp", either on HDFS or locally
depending on the configuration setting.
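
To make the naming scheme concrete, here is a rough sketch in Python. It is only an illustration: the function names are hypothetical, nothing here is existing Pig code, and it handles only the local-filesystem case (an HDFS variant would go through the Hadoop FileSystem API instead). It assumes the timestamp is epoch milliseconds, which the proposal does not specify.

```python
import os
import shutil
import time


def persisted_script_path(persist_dir, job_name, user_name, timestamp=None):
    """Build <persist_dir>/<job_name>+<user_name>+<timestamp>.

    persist_dir stands in for the value of the proposed
    "pig.persist.script" setting (hypothetical, not an existing Pig property).
    """
    if timestamp is None:
        # Assumption: epoch milliseconds; the proposal only says "timestamp".
        timestamp = int(time.time() * 1000)
    # Replace path separators so a job name cannot point outside persist_dir.
    safe_job = job_name.replace("/", "_")
    return os.path.join(persist_dir, f"{safe_job}+{user_name}+{timestamp}")


def persist_script_locally(script_path, persist_dir, job_name, user_name,
                           timestamp=None):
    """Copy the script file into persist_dir and return the new path."""
    target = persisted_script_path(persist_dir, job_name, user_name, timestamp)
    os.makedirs(persist_dir, exist_ok=True)
    shutil.copyfile(script_path, target)
    return target
```

With "pig.persist.script=/pig/", a job named "daily_agg" run by user "alice" would be stored under /pig/ with a name starting "daily_agg+alice+", followed by the timestamp.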

Thanks,
Prashant