Re: pig not respecting parameters in mapred-site.xml
PIG-3135, which you referred to earlier, lets you do that, but it is only
part of 0.12/trunk. You should be able to apply that patch and rebuild Pig
to use it.

Here are the notes on how to use this feature:

########### Override hadoop configs programmatically #################

# By default, Pig expects hadoop configs (hadoop-site.xml and core-site.xml)
# to be present on the classpath. There are cases when these configs
# need to be passed programmatically, such as while using the PigServer API.
# In such cases, you can override hadoop configs by setting the property
# "pig.use.overriden.hadoop.configs".
#
# When this property is set to true, Pig does not look for hadoop configs
# on the classpath and instead picks them up from the Properties/Configuration
# object passed to it.

# pig.use.overriden.hadoop.configs=true
#
######################################################################
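
To make that concrete, here is a rough sketch of one way to feed your own site file to Pig without touching the Hadoop layer. It reads a Hadoop-style XML file into a java.util.Properties object using only the JDK and sets the override flag. The class name SiteConfigLoader, the method fromXml and the sample queue property are made up for illustration; the hand-off to PigServer is left as a comment because it needs Pig (with PIG-3135 applied) on the classpath, and I have not tested it against your custom Hadoop build.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Pig or Hadoop.
class SiteConfigLoader {
    // Property name taken from the notes above (spelling as in Pig).
    static final String OVERRIDE_FLAG = "pig.use.overriden.hadoop.configs";

    // Reads <configuration><property><name/><value/></property>... into Properties.
    static Properties fromXml(InputStream in) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(in);
        Properties props = new Properties();
        NodeList entries = doc.getElementsByTagName("property");
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);
            NodeList names = entry.getElementsByTagName("name");
            NodeList values = entry.getElementsByTagName("value");
            if (names.getLength() == 0 || values.getLength() == 0) {
                continue; // skip incomplete entries
            }
            props.setProperty(names.item(0).getTextContent().trim(),
                              values.item(0).getTextContent().trim());
        }
        // Ask Pig to use these values instead of scanning the classpath.
        props.setProperty(OVERRIDE_FLAG, "true");
        return props;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for the contents of myconfig-site.xml (sample values).
        String xml = "<configuration><property>"
                + "<name>mapred.job.queue.name</name><value>etl</value>"
                + "</property></configuration>";
        Properties props = fromXml(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        System.out.println(props);
        // With Pig on the classpath, the properties would then be passed in:
        //   PigServer pig = new PigServer(ExecType.MAPREDUCE, props);
    }
}
```

Keep in mind that with the flag set, Pig stops looking at the classpath configs, so the Properties object has to carry everything the job needs (including whatever you would normally get from core-site.xml), not just the values you want to change.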

On Tue, Aug 6, 2013 at 7:44 PM, Suhas Satish <[EMAIL PROTECTED]> wrote:

> The parameter values in the default file are all marked as public static
> final. That explains why they were not being overridden by the *-site.xml
> files.
>
>
> Cheers,
> Suhas.
>
>
> On Tue, Aug 6, 2013 at 5:32 PM, Suhas Satish <[EMAIL PROTECTED]>
> wrote:
>
> > None of the parameters in mapred-site.xml are respected. They're being
> > overridden by default configurations in the following file:
> > hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
> >
> >   public String get(String name) {
> >     String val = getProps().getProperty(name);
> >     if (val == null) {
> >       val = CustomConf.getDefault(name);
> >     }
> >     return substituteVars(val);
> >   }
> >
> > Running via java APIs, not via grunt shell.
> > mapred-site.xml exists on the classpath.
> >
> >
> > If I have to make a change to add the resource mapred-site.xml in Apache
> > Pig's org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java,
> > the following changes aren't enough:
> >
> >     private static final String MAPRED_SITE = "mapred-site.xml";
> >     jc.addResource(MAPRED_SITE);
> >     recomputeProperties(jc, properties);
> >
> >
> > My question is: how should I add a new configuration resource file,
> > myconfig-site.xml, to Pig and get Pig to use it without making changes to
> > the Hadoop layer (Configuration.java or JobConf.java)?
> >
> >
> >
> > Cheers,
> > Suhas.
> >
> >
> > On Tue, Aug 6, 2013 at 1:32 PM, Prashant Kommireddi <[EMAIL PROTECTED]>
> > wrote:
> >
> >> Can you tell us how exactly you are running the pig script? Is your
> >> mapred-site.xml on the classpath? Are you trying to run this via grunt
> >> or Java APIs?
> >>
> >>
> >> On Tue, Aug 6, 2013 at 1:16 PM, Suhas Satish <[EMAIL PROTECTED]>
> >> wrote:
> >>
> >> > I am running Pig on a custom Hadoop implementation, but it doesn't
> >> > respect params in mapred-site.xml.
> >> >
> >> > Looking into the code, I find that the following 2 files are slightly
> >> > different from stock hadoop in that some patches are not present.
> >> >
> >> > hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java
> >> > and
> >> > src/mapred/org/apache/hadoop/mapred/JobConf.java
> >> >
> >> > Given the constraint that I cannot modify these files, what change
> >> > should I make within Pig to recognize mapred-site.xml parameters?
> >> >
> >> > I pulled in PIG-3135 and PIG-3145 which make changes to
> >> >  org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java
> >> >
> >> > But the params in mapred-site.xml are still not getting recognized.
> >> > Upon remote Eclipse debugging with breakpoints in the file above, this
> >> > is what I found:
> >> >
> >> > HExecutionEngine.java: jc = new JobConf()
> >> > calls, on the first call in the JobConf() constructor:
> >> > Configuration.get(String)
> >> >   Configuration.getProps() --> if properties == null, properties = new
> >> >   Properties(); loadResources(properties, resources...);
> >> >
>