Hadoop >> mail # user >> Can you unset a mapred.input.dir configuration value?


Re: Can you unset a mapred.input.dir configuration value?
You can use FileInputFormat.setInputPaths(configuration,
job1-output). This will overwrite the old input path(s).
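
To make the difference concrete, here is a minimal sketch in plain Java. It is not Hadoop's code: InputDirModel and its methods are hypothetical stand-ins that only model mapred.input.dir as a single comma-separated string, which is the behavior described in this thread. It shows why appending after setting the empty string yields ",job1-output", and why overwriting avoids the stale entry.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Simplified model of the behavior discussed in this thread.
 * Hypothetical stand-in, NOT Hadoop's Configuration or FileInputFormat.
 */
final class InputDirModel {
    static final String INPUT_DIR = "mapred.input.dir";

    // Stand-in for a configuration object: plain string properties.
    private final Map<String, String> props = new HashMap<>();

    void set(String key, String value) { props.put(key, value); }

    String get(String key) { return props.get(key); }

    // Models addInputPath(): appends to whatever value is already present.
    void addInputPath(String path) {
        String existing = props.get(INPUT_DIR);
        props.put(INPUT_DIR, existing == null ? path : existing + "," + path);
    }

    // Models setInputPaths(): replaces the old value entirely.
    void setInputPaths(String... paths) {
        props.put(INPUT_DIR, String.join(",", paths));
    }

    public static void main(String[] args) {
        InputDirModel conf = new InputDirModel();

        // Job 1's input dir is still set when Job 2 appends its input.
        conf.set(INPUT_DIR, "job1-input");
        conf.addInputPath("job1-output");
        System.out.println(conf.get(INPUT_DIR)); // job1-input,job1-output

        // Clearing with "" leaves an empty first element after the append.
        conf.set(INPUT_DIR, "");
        conf.addInputPath("job1-output");
        System.out.println(conf.get(INPUT_DIR)); // ,job1-output

        // Overwriting discards the old paths, which is the suggestion above.
        conf.setInputPaths("job1-output");
        System.out.println(conf.get(INPUT_DIR)); // job1-output
    }
}
```

In a real job you would call the FileInputFormat methods rather than anything like this model; the sketch is only meant to show the append-versus-overwrite difference.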

-Joey

On Mon, Jan 16, 2012 at 7:16 PM, W.P. McNeill <[EMAIL PROTECTED]> wrote:
>
> Is it possible to unset a configuration value? I think the answer is no,
> but I want to be sure.
>
> I know that you can set a configuration value to the empty string, but I
> have a scenario in which that is not an option. I have a top level Hadoop
> Tool that launches a series of other Hadoop jobs in its run() method. The
> output of the first sub-job becomes the input of the second one and so on.
> The top-level Tool takes a configuration file which specifies parameters
> used by all the sub-jobs. It also specifies a mapred.input.dir value which
> serves as the input directory to the first sub-job.
>
> TopLevelJob() {
>  job1 = createJob1(configuration);
>  // Run job 1
>  job2 = createJob2();
>  FileInputFormat.addInputPath(configuration, job1-output);
>  // Run job 2
> }
>
> The problem is that addInputPath() appends a value to the end of
> mapred.input.dir, erroneously leaving the input directory for Job 1 on the
> list for Job 2. If I try to delete Job 1's input dir by setting
> mapred.input.dir to the empty string like so:
>
> configuration.set("mapred.input.dir", "")
>
> the addInputPath() method appends the input path, giving the value
> ",job1-output". The first element of this list is the empty string, which
> causes an Exception.
>
> I can work around this by calling
> configuration.set("mapred.input.dir", job1-output) directly when creating
> Job 2, but this feels like a hack. It seems like the
> proper way to set input paths is via a FileInputFormat method instead of by
> setting the property directly.
--
Joseph Echeverria
Cloudera, Inc.
443.305.9434