HBase >> mail # user >> hbase.mapred.output.quorum ignored in Mapper job with HDFS source and HBase sink


David Koch 2013-03-24, 12:46
Ted Yu 2013-03-24, 14:35
David Koch 2013-03-26, 15:03
Ted Yu 2013-03-27, 17:40

Re: hbase.mapred.output.quorum ignored in Mapper job with HDFS source and HBase sink
Hello Ted,

I added the example here:
https://issues.apache.org/jira/browse/HBASE-8202#comment-13618389

To highlight the issue I used an HBase table rather than an HDFS file as the
input source. Otherwise, one could merely set hbase.zookeeper.quorum to the
destination cluster as a workaround and get away with it.

Regards,

/David
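[Editor's note: the HDFS-source workaround mentioned above could look roughly like the fragment below. This is an untested sketch, not code from the thread; all host names and paths are placeholders.]

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.MultiTableOutputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

// Sketch of the workaround: run the job on cluster A, read the input file
// from A's HDFS, but point hbase.zookeeper.quorum at cluster B so the
// HBase sink connects to the remote cluster. Placeholder names throughout.
Configuration conf = HBaseConfiguration.create();
conf.set("hbase.zookeeper.quorum", "zk.cluster-b.example.com");
conf.setInt("hbase.zookeeper.property.clientPort", 2181);

Job job = new Job(conf, "hdfs-to-remote-hbase");
job.setInputFormatClass(TextInputFormat.class);
FileInputFormat.setInputPaths(job, new Path("hdfs://cluster-a/path/to/file"));
job.setOutputFormatClass(MultiTableOutputFormat.class);
job.setNumReduceTasks(0);
```

This only works because the input is plain HDFS; with an HBase table as input (as in the JIRA example), both source and sink would read the same quorum setting, which is exactly the conflict being reported.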
On Wed, Mar 27, 2013 at 6:40 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> I logged HBASE-8202
>
> Feel free to attach your example there.
>
> Cheers
>
> On Tue, Mar 26, 2013 at 8:03 AM, David Koch <[EMAIL PROTECTED]> wrote:
>
> > Hello Ted,
> >
> > Yes, I'll put in a request and add a baseline example to reproduce the
> > issue.
> >
> > Thank you for helping me get to the bottom of this,
> >
> > /David
> >
> >
> > On Sun, Mar 24, 2013 at 3:35 PM, Ted Yu <[EMAIL PROTECTED]> wrote:
> >
> > > Looks like MultiTableOutputFormat doesn't support this use case -
> > > MultiTableOutputFormat doesn't extend TableOutputFormat:
> > >
> > > public class MultiTableOutputFormat
> > >     extends OutputFormat<ImmutableBytesWritable, Mutation> {
> > > Relevant configuration is setup in TableOutputFormat#setConf():
> > >
> > >   public void setConf(Configuration otherConf) {
> > >     this.conf = HBaseConfiguration.create(otherConf);
> > >     String tableName = this.conf.get(OUTPUT_TABLE);
> > >     if(tableName == null || tableName.length() <= 0) {
> > >       throw new IllegalArgumentException("Must specify table name");
> > >     }
> > >     String address = this.conf.get(QUORUM_ADDRESS);
> > >     int zkClientPort = conf.getInt(QUORUM_PORT, 0);
> > >     String serverClass = this.conf.get(REGION_SERVER_CLASS);
> > >     String serverImpl = this.conf.get(REGION_SERVER_IMPL);
> > >     try {
> > >       if (address != null) {
> > >         ZKUtil.applyClusterKeyToConf(this.conf, address);
> > >       }
> > >
> > > Mind filing a JIRA for enhancement?
> > >
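[Editor's note: the QUORUM_ADDRESS string that ZKUtil.applyClusterKeyToConf() consumes in the excerpt above is an HBase "cluster key" — the same format as CopyTable's --peer.adr. As a rough, self-contained illustration of how that string decomposes (a hypothetical helper, not HBase code):]

```java
// Hypothetical parser mirroring the shape of an HBase cluster key:
//   <zk quorum hosts>:<zk client port>:<znode parent>
// e.g. "zk1.example.com,zk2.example.com:2181:/hbase".
// Inside HBase this mapping is done by ZKUtil.applyClusterKeyToConf().
public class ClusterKey {
    public final String quorum;   // -> hbase.zookeeper.quorum
    public final int clientPort;  // -> hbase.zookeeper.property.clientPort
    public final String parent;   // -> zookeeper.znode.parent

    public ClusterKey(String key) {
        // Comma-separated hosts contain no colon, so a plain split works.
        String[] parts = key.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Bad cluster key: " + key);
        }
        this.quorum = parts[0];
        this.clientPort = Integer.parseInt(parts[1]);
        this.parent = parts[2];
    }
}
```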
> > > On Sun, Mar 24, 2013 at 5:46 AM, David Koch <[EMAIL PROTECTED]>
> > wrote:
> > >
> > > > Hello,
> > > >
> > > > I want to import a file on HDFS from one cluster A (source) into
> HBase
> > > > tables on a different cluster B (destination) using a Mapper job with
> > an
> > > > HBase sink. Both clusters run HBase.
> > > >
> > > > This setup works fine:
> > > > - Run Mapper job on cluster B (destination)
> > > > - "mapred.input.dir" --> hdfs://<cluster-A>/<path-to-file> (file on
> > > source
> > > > cluster)
> > > > - "hbase.zookeeper.quorum" --> <quorum-hostname-B>
> > > > - "hbase.zookeeper.property.clientPort" --> <quorum-port-B>
> > > >
> > > > I thought it should be possible to run the job on cluster A (source)
> > and
> > > > using "hbase.mapred.output.quorum" to insert into the tables on
> cluster
> > > B.
> > > > This is what the CopyTable utility does. However, the following does
> > not
> > > > work. HBase looks for the destination table(s) on cluster A and NOT
> > > cluster
> > > > B:
> > > > - Run Mapper job on cluster A (source)
> > > > - "mapred.input.dir" --> hdfs://<cluster-A>/<path-to-file> (file is
> > > local)
> > > > - "hbase.zookeeper.quorum" --> <quorum-hostname-A>
> > > > - "hbase.zookeeper.property.clientPort" --> <quorum-port-A>
> > > > - "hbase.mapred.output.quorum" -> <quorum-hostname-B>:2181:/hbase
> (same
> > > as
> > > > --peer.adr argument for CopyTable)
> > > >
> > > > Job setup inside the class MyJob is as follows; note that I am using
> > > > MultiTableOutputFormat.
> > > >
> > > > Configuration conf = HBaseConfiguration.addHbaseResources(getConf());
> > > > Job job = new Job(conf);
> > > > job.setJarByClass(MyJob.class);
> > > > job.setMapperClass(JsonImporterMapper.class);
> > > > // Note, several output tables!
> > > > job.setOutputFormatClass(MultiTableOutputFormat.class);
> > > > job.setNumReduceTasks(0);
> > > > TableMapReduceUtil.addDependencyJars(job);
> > > > TableMapReduceUtil.addDependencyJars(job.getConfiguration());
> > > >
> > > > The Mapper class has the following frame:
> > > >
> > > > public static class JsonImporterMapper extends
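[Editor's note: the archived message breaks off here. A typical frame for a mapper feeding MultiTableOutputFormat — where the output key names the destination table and the value is the Mutation to apply — might look like the sketch below. The body, column names, and table name are invented, not recovered from the original message.]

```java
// Hypothetical completion of the truncated mapper frame. With
// MultiTableOutputFormat the output key is the destination table name
// (as ImmutableBytesWritable) and the value is the Mutation to write.
public static class JsonImporterMapper extends
    Mapper<LongWritable, Text, ImmutableBytesWritable, Mutation> {

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Parse the input line, build a Put, and route it to a table.
    Put put = new Put(Bytes.toBytes(value.toString()));  // placeholder row key
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
    context.write(new ImmutableBytesWritable(Bytes.toBytes("my_table")), put);
  }
}
```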
Ted Yu 2013-03-31, 19:38