Re: MiniMRCluster usage in dependent projects
[changing thread name to not hijack the vote thread]

On Thu, May 10, 2012 at 11:23 AM, Andrew Purtell <[EMAIL PROTECTED]> wrote:
> Hi Todd,
>
>> Have you seen the new MiniMRClientCluster class? It's meant to be what
>> you describe - a minicluster which only exposes "external" APIs --
>> most importantly a way of getting at a JobClient to submit jobs. We
>> have it implemented in both 1.x and 2.x at this point, though I don't
>> recall if it's in the 1.0.x releases or if it's only slated for 1.1+
>
> Do you mean the below?
>
>    /*
>     * A simple interface for a client MR cluster used for testing. This
>     * interface provides basic methods which are independent of the
>     * underlying Mini Cluster (either through MR1 or MR2).
>     */
>    public interface MiniMRClientCluster {
>      public void start() throws IOException;
>      public void stop() throws IOException;
>      public Configuration getConfig() throws IOException;
>    }
>
> This doesn't sufficiently encapsulate the mini MR cluster for the
> purposes of a test rig. The issues we've seen are variations in what
> configuration variables are required: their names, and their
> semantics, for finding information about how the cluster is set up.
> Let's take one basic case: how does one find the address of the job
> tracker in a version-agnostic way? For example, perhaps:
>
>    public InetSocketAddress getJobTrackerAddress();

The issue is that MR2 doesn't have a JobTracker address. Neither does
it have TaskTrackers. So there is no real way to expose this.

I don't see any reason that HBase should need to get these things --
so long as it can get a Configuration, it should be able to submit
jobs.
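
To make that concrete, here is a rough sketch of the pattern I have in mind. It is not Hadoop code: ClientCluster is a stand-in for the three-method interface quoted above, a plain Map stands in for Configuration, and FakeMr1Cluster, FakeMr2Cluster, submit() and runJobOn() are hypothetical names made up for illustration. The config keys are the usual MR1 and MR2 ones; the host:port values are placeholders. The point is only that the test code never looks inside the config to find a JobTracker.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for the three-method interface quoted above; NOT the Hadoop type.
// A plain Map replaces org.apache.hadoop.conf.Configuration in this sketch.
interface ClientCluster {
    void start();
    void stop();
    Map<String, String> getConfig();
}

// Hypothetical MR1-style backend: clients locate the cluster via a JobTracker address.
class FakeMr1Cluster implements ClientCluster {
    private final Map<String, String> conf = new HashMap<>();
    public void start() { conf.put("mapred.job.tracker", "localhost:9001"); }
    public void stop() { conf.clear(); }
    public Map<String, String> getConfig() { return new HashMap<>(conf); }
}

// Hypothetical MR2-style backend: no JobTracker, clients are pointed at YARN instead.
class FakeMr2Cluster implements ClientCluster {
    private final Map<String, String> conf = new HashMap<>();
    public void start() {
        conf.put("mapreduce.framework.name", "yarn");
        conf.put("yarn.resourcemanager.address", "localhost:8032");
    }
    public void stop() { conf.clear(); }
    public Map<String, String> getConfig() { return new HashMap<>(conf); }
}

class MiniClusterSketch {
    // Hypothetical submit step. A real test would build a job client from the
    // Configuration; here we only check that a non-empty config was handed over.
    static String submit(Map<String, String> conf, String jobName) {
        if (conf.isEmpty()) {
            throw new IllegalStateException("cluster not started");
        }
        return jobName + " submitted with " + conf.size() + " setting(s)";
    }

    // The dependent project's test: identical for both backends, and it never
    // reads a JobTracker or TaskTracker address out of the config.
    static String runJobOn(ClientCluster cluster, String jobName) {
        cluster.start();
        try {
            return submit(cluster.getConfig(), jobName);
        } finally {
            cluster.stop();
        }
    }

    public static void main(String[] args) {
        System.out.println(runJobOn(new FakeMr1Cluster(), "wordcount"));
        System.out.println(runJobOn(new FakeMr2Cluster(), "wordcount"));
    }
}
```

runJobOn() is the same for both backends, which is the property a dependent project's test rig would want.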

>
> or at a higher level of abstraction:
>
>    public JobTrackerInfo getJobTracker();
>
>    public TaskTrackerInfo[] getTaskTrackers();
>
> and, since this is a test rig, we'd like to terminate, perhaps abruptly,
> a task tracker, or launch replacements, or launch new ones.
>
>    public boolean stopTaskTracker(TaskTrackerInfo tracker, boolean force);
>
>    public TaskTrackerInfo startTaskTracker(
>        ... /* some universal public parameters TBD */);

The above should only be useful for system-testing MR itself. But for
dependent projects (e.g., HBase, Hive), what's the use case?

-Todd
--
Todd Lipcon
Software Engineer, Cloudera