HBase >> mail # user >> Multiple scan input split for MR job


Re: Multiple scan input split for MR job
Nothing official AFAIK. It looks like you already understand what your other solution is.

J-D

On Wed, Aug 8, 2012 at 5:41 PM, Eric Czech <[EMAIL PROTECTED]> wrote:
> Hi everyone,
>
> I've been searching for a way to specify an MR job on an HBase table
> using multiple key ranges (instead of just one), and as far as I can
> tell, the best way is still to create a custom InputFormat like
> MultiSegmentTableInputFormat and override getSplits to return splits
> based on multiple scan objects.
>
> Is this still the best way to do this or is there any official support yet?
>
> If it is still the best way to do it, does anyone have an
> implementation of this that they'd be willing to share?  I'm new to
> HBase and I'm not so sure I'd be able to do that well myself.
>
> Thank you for your time!
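
For readers following the thread, here is a minimal sketch of the split computation that an overridden getSplits would need to perform: intersect each requested key range with each region's key range and emit one split per overlap. This is not an implementation from the thread, and it does not use the HBase API. The names MultiRangeSplits, KeyRange, intersect, and computeSplits are hypothetical, and row keys are modeled as Strings for brevity, where HBase compares byte[] keys. An empty string stands for an open start or end key, following HBase's convention for the first and last region.

```java
import java.util.ArrayList;
import java.util.List;

// Dependency-free sketch: models row keys as Strings, not HBase byte[] keys.
// All names here are illustrative assumptions, not HBase classes.
final class MultiRangeSplits {

    /** Half-open key range [start, end); "" means unbounded on that side. */
    static final class KeyRange {
        final String start;
        final String end;

        KeyRange(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    /** Returns the overlap of a and b, or null if they do not overlap. */
    static KeyRange intersect(KeyRange a, KeyRange b) {
        // "" as a start key sorts before every other key, so a plain max works.
        String start = a.start.compareTo(b.start) >= 0 ? a.start : b.start;

        // "" as an end key means "no upper bound", so it needs explicit handling.
        String end;
        if (a.end.isEmpty()) {
            end = b.end;
        } else if (b.end.isEmpty()) {
            end = a.end;
        } else {
            end = a.end.compareTo(b.end) <= 0 ? a.end : b.end;
        }

        if (!end.isEmpty() && start.compareTo(end) >= 0) {
            return null; // empty intersection
        }
        return new KeyRange(start, end);
    }

    /** Emits one split per (scan range, region) pair that overlaps. */
    static List<KeyRange> computeSplits(List<KeyRange> regions, List<KeyRange> scanRanges) {
        List<KeyRange> splits = new ArrayList<>();
        for (KeyRange scan : scanRanges) {
            for (KeyRange region : regions) {
                KeyRange overlap = intersect(scan, region);
                if (overlap != null) {
                    splits.add(overlap);
                }
            }
        }
        return splits;
    }
}
```

In a real job, the same loop would presumably live in a subclass of TableInputFormatBase, with the regions taken from the table's start and end keys and each overlap wrapped in a split object that carries the matching scan. That wiring is left out here because it depends on the HBase version in use.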