Hadoop, mail # general - [VOTE] Release hadoop-2.0.0-alpha


Re: [VOTE] Release hadoop-2.0.0-alpha
Andrew Purtell 2012-05-10, 18:30
> Let's take one basic case, how does one find the address of the job
> tracker in a version agnostic way?

Pardon, I should also have included the other half of that particular
problem: How does one set the address of the job tracker in a version
agnostic way?

    Configuration conf = HBaseConfiguration.create(); // adds HBase default resources
    Configuration jobConf = miniPublicMRCluster.getConfig();
    Configuration merged =
        intelligentlyMergeInAVersionAgnosticWaySoAJobWillRunSuccessfully(conf, jobConf); // :-)
    JobConf job = new JobConf(merged);
    ....
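
One way to read the merge step above, as a minimal sketch rather than an existing Hadoop utility: copy only the cluster-addressing keys from the mini cluster's configuration over the client configuration. A plain Map<String, String> stands in for Configuration so the example runs without Hadoop on the classpath, and the class and method names are hypothetical. The keys listed are the ones I am aware of (mapred.job.tracker in 1.x, its renamed form mapreduce.jobtracker.address, and mapreduce.framework.name plus yarn.resourcemanager.address for YARN); a real merge would likely need more.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; Map<String, String> stands in for Hadoop's Configuration. */
final class VersionAgnosticConf {
    // Keys that say where the MapReduce cluster lives (assumed list, not exhaustive).
    static final List<String> CLUSTER_KEYS = Arrays.asList(
        "mapred.job.tracker",            // MR1 (1.x)
        "mapreduce.jobtracker.address",  // renamed form of the MR1 key
        "mapreduce.framework.name",      // MR2: "local", "classic" or "yarn"
        "yarn.resourcemanager.address"); // MR2 on YARN

    private VersionAgnosticConf() {}

    /** Returns a copy of base with every cluster-addressing key found in cluster overlaid. */
    static Map<String, String> merge(Map<String, String> base, Map<String, String> cluster) {
        Map<String, String> merged = new HashMap<>(base);
        for (String key : CLUSTER_KEYS) {
            String value = cluster.get(key);
            if (value != null) {
                merged.put(key, value);
            }
        }
        return merged;
    }

    /** Returns the first job-submission address that is set, or null if none is. */
    static String findSubmitAddress(Map<String, String> conf) {
        String[] candidates = {
            "yarn.resourcemanager.address",
            "mapreduce.jobtracker.address",
            "mapred.job.tracker"
        };
        for (String key : candidates) {
            String value = conf.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}
```

Even this small sketch shows the problem: the caller has to know every key name and its meaning in every version, which is exactly the knowledge a test rig should not need.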

Best regards,

    - Andy
On Thu, May 10, 2012 at 11:23 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> Hi Todd,
>
>> Have you seen the new MiniMRClientCluster class? It's meant to be what
>> you describe - a minicluster which only exposes "external" APIs --
>> most importantly a way of getting at a JobClient to submit jobs. We
>> have it implemented in both 1.x and 2.x at this point, though I don't
>> recall if it's in the 1.0.x releases or if it's only slated for 1.1+
>
> Do you mean the below?
>
>    /*
>     * A simple interface for a client MR cluster used for testing. This
>     * interface provides basic methods which are independent of the
>     * underlying Mini Cluster (either through MR1 or MR2).
>     */
>    public interface MiniMRClientCluster {
>      public void start() throws IOException;
>      public void stop() throws IOException;
>      public Configuration getConfig() throws IOException;
>    }
>
> This doesn't sufficiently encapsulate the mini MR cluster for the
> purposes of a test rig. The issues we've seen are variations between
> versions in which configuration variables are required, what they are
> named, and what they mean when finding out how the cluster is set up.
> Let's take one basic case, how does one find the address of the job
> tracker in a version agnostic way? For example, perhaps:
>
>    public InetSocketAddress getJobTrackerAddress();
>
> or at a higher level of abstraction:
>
>    public JobTrackerInfo getJobTracker();
>
>    public TaskTrackerInfo[] getTaskTrackers();
>
> and, since this is a test rig, we'd like to terminate, perhaps abruptly,
> a task tracker, or launch replacements, or launch new ones.
>
>    public boolean stopTaskTracker(TaskTrackerInfo tracker, boolean force);
>
>    public TaskTrackerInfo startTaskTracker(
>        ... /* some universal public parameters TBD */);
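
To make the shape of the proposal above concrete, here is a minimal sketch that gathers the suggested methods into one interface and backs it with an in-memory fake. None of this exists in Hadoop: MiniMRTestRig, FakeMRTestRig, and this TaskTrackerInfo are hypothetical names. The fake only does bookkeeping and starts no daemons, and startTaskTracker takes no arguments because its parameters are still TBD.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical handle for one task tracker; only a name is modeled here. */
final class TaskTrackerInfo {
    final String name;

    TaskTrackerInfo(String name) {
        this.name = name;
    }
}

/** Hypothetical version-neutral view of a mini MR cluster. */
interface MiniMRTestRig {
    InetSocketAddress getJobTrackerAddress();

    TaskTrackerInfo[] getTaskTrackers();

    TaskTrackerInfo startTaskTracker(); // real parameters are TBD in the proposal

    boolean stopTaskTracker(TaskTrackerInfo tracker, boolean force);
}

/** In-memory fake: tracks which trackers are "running", starts no daemons. */
final class FakeMRTestRig implements MiniMRTestRig {
    private final InetSocketAddress jobTracker;
    private final List<TaskTrackerInfo> trackers = new ArrayList<>();
    private int nextId = 0;

    FakeMRTestRig(String host, int port) {
        // createUnresolved avoids a DNS lookup in the sketch.
        this.jobTracker = InetSocketAddress.createUnresolved(host, port);
    }

    @Override
    public InetSocketAddress getJobTrackerAddress() {
        return jobTracker;
    }

    @Override
    public TaskTrackerInfo[] getTaskTrackers() {
        return trackers.toArray(new TaskTrackerInfo[0]);
    }

    @Override
    public TaskTrackerInfo startTaskTracker() {
        TaskTrackerInfo tracker = new TaskTrackerInfo("tracker-" + nextId++);
        trackers.add(tracker);
        return tracker;
    }

    @Override
    public boolean stopTaskTracker(TaskTrackerInfo tracker, boolean force) {
        // The fake ignores force; a real rig would choose between a clean and an abrupt stop.
        return trackers.remove(tracker);
    }
}
```

A test written against MiniMRTestRig would never touch a configuration key, so the same test source could in principle run against an MR1-backed or MR2-backed implementation.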
>
> And, likewise for HDFS,
>
>    public interface MiniHDFSClientCluster {
>      public void start() throws IOException;
>      public void stop() throws IOException;
>      public Configuration getConfig() throws IOException;
>      public NameNodeInfo[] getNameNodes();
>      public DataNodeInfo[] getDataNodes();
>      public DataNodeInfo startDataNode(...);
>      public boolean stopDataNode(DataNodeInfo dn, boolean force);
>      // Convenience method for getting the filesystem for the cluster
>      // This needs some thought, because we have FileSystem in 1.x
>      // and FileContext in 2.x
>      // Here we will use a hypothetical wrapper that uses reflection as needed
>      public FileContext getFileContext();
>    }
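
The reflection-based wrapper mentioned in the comment above might begin by probing which filesystem API is on the classpath. This is a minimal sketch under that assumption: FsApiProbe and its methods are hypothetical, and only Class.forName from the JDK is used, so it also runs without Hadoop present (in which case it reports neither API).

```java
/** Hypothetical probe for which Hadoop filesystem API is available at runtime. */
final class FsApiProbe {
    private FsApiProbe() {}

    /** Returns true if the named class can be loaded, without initializing it. */
    static boolean isClassPresent(String className) {
        try {
            Class.forName(className, false, FsApiProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Prefers FileContext when present, falls back to FileSystem, else "none". */
    static String preferredApi() {
        if (isClassPresent("org.apache.hadoop.fs.FileContext")) {
            return "FileContext";
        }
        if (isClassPresent("org.apache.hadoop.fs.FileSystem")) {
            return "FileSystem";
        }
        return "none";
    }
}
```

A wrapper could then dispatch on the result of preferredApi() and invoke the chosen API reflectively, so the test code itself compiles against neither.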
>
> and, perhaps additionally a convenience method for corrupting blocks:
>
>    public void writeBlock(Block block, byte[] data, long offset,
>        boolean updateChecksum) throws IOException;
>
> and so on.
>
> Best regards,
>
>    - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet
> Hein (via Tom White)
