Persisting Pig Scripts
Hi All,

What do you guys think about adding a feature to persist the
script (the file, or the cache in the case of Grunt) on HDFS or locally,
based on an admin setting in pig.properties? This would help
infrastructure/ops teams analyze the nature of Pig scripts and make
decisions based on it (optimizing data storage based on access patterns,
etc.). This is actually something we want to do, but the challenge is that
there is no central place where we can track user scripts.

It could be a config param, "pig.persist.script=/pig/". The script could be
stored under a configurable name, such as
"${mapred.job.name}+${user.name}+timestamp", either on HDFS or locally,
depending on the configuration setting.
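
To make the naming scheme concrete, here is a minimal sketch of how the destination path might be built. It is only an illustration, not existing Pig code: the function name persisted_script_path is hypothetical, the timestamp is assumed to be epoch milliseconds (the proposal does not fix a format), and replacing "/" in the job and user names is an extra precaution I am assuming would be wanted. How HDFS versus local storage is selected is left out, since that part of the setting is still open.

```python
import posixpath
import time


def persisted_script_path(persist_dir, job_name, user_name, timestamp_ms=None):
    """Build the destination path for a persisted script (hypothetical helper).

    persist_dir stands in for the value of the proposed pig.persist.script
    setting, e.g. "/pig/".
    """
    if timestamp_ms is None:
        # Assumption: epoch milliseconds; the proposal does not specify a format.
        timestamp_ms = int(time.time() * 1000)
    # Replace path separators so a job or user name cannot point outside persist_dir.
    safe_job = job_name.replace("/", "_")
    safe_user = user_name.replace("/", "_")
    # Mirrors the proposed ${mapred.job.name}+${user.name}+timestamp pattern.
    file_name = f"{safe_job}+{safe_user}+{timestamp_ms}"
    return posixpath.join(persist_dir, file_name)
```

For example, persisted_script_path("/pig/", "daily_report", "alice", 1300000000000) gives "/pig/daily_report+alice+1300000000000". Including the timestamp keeps repeated runs of the same job by the same user from overwriting each other.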