

hbase.mapred.output.quorum ignored in Mapper job with HDFS source and HBase sink
Hello,

I want to import a file on HDFS from one cluster A (source) into HBase
tables on a different cluster B (destination) using a Mapper job with an
HBase sink. Both clusters run HBase.

This setup works fine:
- Run Mapper job on cluster B (destination)
- "mapred.input.dir" --> hdfs://<cluster-A>/<path-to-file> (file on source
cluster)
- "hbase.zookeeper.quorum" --> <quorum-hostname-B>
- "hbase.zookeeper.property.clientPort" --> <quorum-port-B>

I thought it should be possible to run the job on cluster A (source) and
use "hbase.mapred.output.quorum" to insert into the tables on cluster B,
which is what the CopyTable utility does. However, the following setup does
not work. HBase looks for the destination table(s) on cluster A and not on
cluster B:
- Run Mapper job on cluster A (source)
- "mapred.input.dir" --> hdfs://<cluster-A>/<path-to-file> (file is local)
- "hbase.zookeeper.quorum" --> <quorum-hostname-A>
- "hbase.zookeeper.property.clientPort" --> <quorum-port-A>
- "hbase.mapred.output.quorum" --> <quorum-hostname-B>:2181:/hbase (same as
the --peer.adr argument for CopyTable)
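
For reference, the value I pass has the form <quorum>:<client-port>:<znode-parent>, as far as I understand the cluster key that CopyTable expects. Below is a minimal sketch, in plain Java with no HBase dependency, of how such a key breaks down into the three client settings I would expect it to stand for. The class and method names (ClusterKeySketch, parse) and the example hostname are hypothetical and only for illustration; this is not HBase's own parsing code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of HBase: splits a cluster key of the form
// <quorum>:<client-port>:<znode-parent> into the client settings it names.
class ClusterKeySketch {

    static Map<String, String> parse(String clusterKey) {
        // The quorum may be a comma-separated host list, so split only on ':'.
        String[] parts = clusterKey.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected <quorum>:<port>:<znode-parent>, got: " + clusterKey);
        }
        // Fails with NumberFormatException if the port is not numeric.
        Integer.parseInt(parts[1]);

        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("hbase.zookeeper.quorum", parts[0]);
        settings.put("hbase.zookeeper.property.clientPort", parts[1]);
        settings.put("zookeeper.znode.parent", parts[2]);
        return settings;
    }

    public static void main(String[] args) {
        // Example hostname is made up; substitute the quorum of cluster B.
        parse("quorum-b.example.com:2181:/hbase")
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

In the failing scenario, these are the values I would expect the output side of the job to use for cluster B, while the plain "hbase.zookeeper.quorum" setting keeps pointing at cluster A.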

The job setup inside the class MyJob is as follows. Note that I am using
MultiTableOutputFormat.

Configuration conf = HBaseConfiguration.addHbaseResources(getConf());
Job job = new Job(conf);
job.setJarByClass(MyJob.class);
job.setMapperClass(JsonImporterMapper.class);
// Note, several output tables!
job.setOutputFormatClass(MultiTableOutputFormat.class);
job.setNumReduceTasks(0);
TableMapReduceUtil.addDependencyJars(job);
TableMapReduceUtil.addDependencyJars(job.getConfiguration());

The Mapper class has the following signature:

public static class JsonImporterMapper extends
    Mapper<LongWritable, Text, ImmutableBytesWritable, Put> { }

Is this expected behaviour? How can I get the second scenario, using
"hbase.mapred.output.quorum", to work? Could the fact that I am using
MultiTableOutputFormat instead of TableOutputFormat play a part? I am using
HBase 0.92.1.

Thank you,

/David
Ted Yu 2013-03-24, 14:35
David Koch 2013-03-26, 15:03
Ted Yu 2013-03-27, 17:40
David Koch 2013-03-31, 17:14
Ted Yu 2013-03-31, 19:38