HBase >> mail # user >> Hbase sequential row merging in MapReduce job


Re: Hbase sequential row merging in MapReduce job

As long as you know your keyspace, you should be able to create your own
splits by overriding getSplits().  See TableInputFormatBase for the default
implementation, which creates one input split per region.
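
For what it's worth, here is a rough sketch of the row-range arithmetic such
custom splits would need. It is plain Java with no HBase dependency, and the
names PrefixSplits, RowRange, stopRowForPrefix and rangesFor are made up for
illustration. It assumes you already know the list of key prefixes and that
each prefix ends in a delimiter ("metric1_" rather than "metric1", so that
"metric10_..." rows are not swept in). In a real job, each range would
presumably become the start and end row of one split returned from your
getSplits() override; that wiring is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrefixSplits {

    /** Hypothetical value type: one split covering rows in [startRow, stopRow). */
    public static final class RowRange {
        public final byte[] startRow;
        public final byte[] stopRow; // exclusive; empty means "no upper bound"

        public RowRange(byte[] startRow, byte[] stopRow) {
            this.startRow = startRow;
            this.stopRow = stopRow;
        }

        /** True if rowKey falls inside this range, using unsigned byte order. */
        public boolean contains(byte[] rowKey) {
            return compare(rowKey, startRow) >= 0
                && (stopRow.length == 0 || compare(rowKey, stopRow) < 0);
        }
    }

    /** Unsigned lexicographic comparison, the order HBase sorts row keys in. */
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Smallest key that sorts after every key starting with the given prefix. */
    public static byte[] stopRowForPrefix(byte[] prefix) {
        // Trailing 0xFF bytes cannot be incremented, so drop them first.
        int len = prefix.length;
        while (len > 0 && prefix[len - 1] == (byte) 0xFF) {
            len--;
        }
        if (len == 0) {
            return new byte[0]; // prefix was empty or all 0xFF: no upper bound
        }
        byte[] stop = Arrays.copyOf(prefix, len);
        stop[len - 1]++; // bump the last remaining byte by one
        return stop;
    }

    /** One range per known prefix, so all rows sharing a prefix land together. */
    public static List<RowRange> rangesFor(List<String> prefixes) {
        List<RowRange> ranges = new ArrayList<>();
        for (String p : prefixes) {
            byte[] start = p.getBytes(StandardCharsets.UTF_8);
            ranges.add(new RowRange(start, stopRowForPrefix(start)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (RowRange r : rangesFor(Arrays.asList("metric1_", "metric2_"))) {
            System.out.println(new String(r.startRow, StandardCharsets.UTF_8)
                + " .. " + new String(r.stopRow, StandardCharsets.UTF_8));
        }
    }
}
```

One range per prefix means one mapper per prefix, which may be too many small
tasks if you have a lot of metrics. Grouping several adjacent prefixes into a
single range would keep the same guarantee with fewer splits.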

On 10/19/12 9:32 AM, "Eric Czech" <[EMAIL PROTECTED]> wrote:

>Hi everyone,
>
>Is there any way to create an InputSplit for a MapReduce job (reading from
>an HBase table) that guarantees sequential rows with some shared key prefix
>will end up in the same mapper?
>
>For example, if I have sequential keys like this:
>
>metric1_2010,
>metric1_2011,
>metric1_2012,
>metric2_2011,
>metric2_2012,
>...
>
>I want a mapper that will definitely see all the rows with keys that start
>with "metric1".
>
>Is there a way to do this?
>
>Thank you!