Pig >> mail # dev >> writing info to configs


writing info to configs
Hi,

Pig internally generates a lot of runtime info that would be useful to have
output in the job config for later debugging and analysis. Should we develop a
standard for params that are written to the job conf file, but not accepted as
job input?

For example, these params are accepted as input:
pig.exec.reducers.max
pig.exec.reducers.bytes.per.reducer
But these (not-yet-supported) params would not be accepted as input; they would only be produced:

pig.info.reducers.requested.parallel
pig.info.reducers.estimated.parallel
pig.info.reducers.runtime.parallel
I'm proposing the 'pig.info' prefix for these output-only params. We could
even return an error message if someone tries to set them. See this comment
for more context:

https://issues.apache.org/jira/browse/PIG-2779?focusedCommentId=13422680&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422680
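
To make the proposal concrete, here is a rough sketch of how the prefix convention could behave. It is only an illustration under assumptions: the class PigInfoParams and its methods are hypothetical and do not exist in Pig, and java.util.Properties stands in for the real job conf so the example has no Hadoop dependency.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper illustrating the proposed convention; not part of Pig. */
final class PigInfoParams {
    /** Proposed prefix for params that Pig writes but never reads as input. */
    static final String INFO_PREFIX = "pig.info.";

    /** Returns the user-supplied keys that fall under the output-only prefix, sorted. */
    static List<String> findReservedKeys(Properties userProps) {
        List<String> reserved = new ArrayList<>();
        for (String key : userProps.stringPropertyNames()) {
            if (key.startsWith(INFO_PREFIX)) {
                reserved.add(key);
            }
        }
        Collections.sort(reserved);
        return reserved;
    }

    /** Rejects user input that tries to set an output-only param. */
    static void validateUserInput(Properties userProps) {
        List<String> reserved = findReservedKeys(userProps);
        if (!reserved.isEmpty()) {
            throw new IllegalArgumentException(
                "Params under '" + INFO_PREFIX + "' are output-only and cannot be set: " + reserved);
        }
    }

    /** Records a runtime value in the job conf under the output-only prefix. */
    static void recordInfo(Properties jobConf, String name, int value) {
        jobConf.setProperty(INFO_PREFIX + name, Integer.toString(value));
    }

    public static void main(String[] args) {
        // Input params such as pig.exec.reducers.max pass validation.
        Properties userProps = new Properties();
        userProps.setProperty("pig.exec.reducers.max", "100");
        validateUserInput(userProps);

        // Pig itself would write the info params after planning the job.
        Properties jobConf = new Properties();
        jobConf.putAll(userProps);
        recordInfo(jobConf, "reducers.requested.parallel", 20);
        recordInfo(jobConf, "reducers.estimated.parallel", 12);
        System.out.println(jobConf.getProperty("pig.info.reducers.requested.parallel"));
    }
}
```

Whether a user-set 'pig.info' key should fail the job or only log a warning is an open question; the sketch throws simply to show where such a check could sit.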

Thoughts?
thanks,
Bill