Pig, mail # user - PigRunner setting replication level


PigRunner setting replication level
Adam Silberstein 2014-01-29, 00:48
Hi,
I'm having some trouble controlling the output replication level with PigRunner.  In particular, in a single-box dev environment I want to set replication to 1.  Otherwise, with replication at 3x, the name node marks all blocks as under-replicated and eventually starts freaking out.

Here are some details:
-I set replication to 1 in hdfs-site.xml.  I set all relevant environment variables like HADOOP_CONF_DIR and PIG_HOME
-When I run pig on the command line I get my desired output replication of 1.
-When I run pig through PigRunner I get output replication of 3.
  -I checked all environment variables inside the process that uses PigRunner.  They match what I see in the shell.  (I'm not sure PigRunner would pick these up anyway.)
  -I pass a properties file to PigRunner.  The only relevant property there is 'mapred.submit.replication=1'
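
One further check that might narrow this down is whether hdfs-site.xml is actually visible to the JVM that calls PigRunner. The sketch below rests on an assumption that this thread does not confirm: that Hadoop client code reads hdfs-site.xml as a classpath resource, and that the command-line launcher scripts put HADOOP_CONF_DIR on the classpath while an embedding process may not. The class name ConfVisibilityCheck is hypothetical, and the code uses only the Java standard library.

```java
import java.net.URL;

public class ConfVisibilityCheck {

    /** Returns the classpath URL of the named resource, or null if it is not found. */
    static URL locate(String resourceName) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            // Fall back to the loader that loaded this class.
            cl = ConfVisibilityCheck.class.getClassLoader();
        }
        return cl.getResource(resourceName);
    }

    /** Builds a one-line report saying where (or whether) the resource was found. */
    static String describe(String resourceName) {
        URL url = locate(resourceName);
        return resourceName + " -> " + (url == null ? "NOT on classpath" : url.toString());
    }

    public static void main(String[] args) {
        // The environment variable alone does not put the directory on the classpath.
        System.out.println("HADOOP_CONF_DIR = " + System.getenv("HADOOP_CONF_DIR"));
        for (String name : new String[] {"core-site.xml", "hdfs-site.xml", "mapred-site.xml"}) {
            System.out.println(describe(name));
        }
    }
}
```

Running this with the same classpath as the PigRunner process shows whether each site file resolves, and to which path. If hdfs-site.xml is reported as missing, the replication setting in that file may simply not be reaching the embedded run.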

My best guess is that I'm not passing in the correct properties, but I'm not sure.  Any suggestions would be appreciated.
Thanks,
Adam