HBase >> mail # user >> MapReduce job with mixed data sources: HBase table and HDFS files


Re: MapReduce job with mixed data sources: HBase table and HDFS files
You may want to pull your data out of HBase first in a separate map-only job, and then use its output along with your other HDFS input.
Reads from HDFS and reads from HBase differ significantly in performance.
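
To make the two-stage idea concrete, here is a rough sketch of the data flow only. It deliberately uses plain JDK classes instead of Hadoop or HBase so that it runs anywhere. The in-memory map stands in for the HBase table, and the class name, method names, file names, and the "hbase"/"hdfs" tags are all made up for illustration. In a real job, stage 1 would presumably be a map-only job (for example one set up with TableMapReduceUtil.initTableMapperJob and job.setNumReduceTasks(0)) writing to HDFS, and stage 2 would be an ordinary job over both directories.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the two jobs; no Hadoop or HBase API is used here.
class TwoStageSketch {

    // Stage 1 (stand-in for the map-only export job): write every table row
    // as "rowKey<TAB>value" so the next stage can read it like any other file.
    static void exportTable(Map<String, String> table, Path out) throws IOException {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> row : table.entrySet()) {
            lines.add(row.getKey() + "\t" + row.getValue());
        }
        Files.write(out, lines, StandardCharsets.UTF_8);
    }

    // Stage 2 (stand-in for the main job): read both inputs, tag each record
    // with its source, and group the tagged values by key.
    static Map<String, List<String>> joinByKey(Path exported, Path other) throws IOException {
        Map<String, List<String>> grouped = new TreeMap<>();
        collect(exported, "hbase", grouped);
        collect(other, "hdfs", grouped);
        return grouped;
    }

    private static void collect(Path file, String tag, Map<String, List<String>> grouped)
            throws IOException {
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            int tab = line.indexOf('\t');
            if (tab < 0) {
                continue; // skip lines that are not "key<TAB>value"
            }
            String key = line.substring(0, tab);
            String value = line.substring(tab + 1);
            grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(tag + ":" + value);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("two-stage");
        Path exported = dir.resolve("hbase-export.tsv");
        Path other = dir.resolve("other-input.tsv");

        // Pretend HBase table: row key -> one column value.
        Map<String, String> table = new TreeMap<>();
        table.put("row1", "a");
        table.put("row2", "b");
        exportTable(table, exported);

        // Pretend pre-existing HDFS input.
        Files.write(other, Arrays.asList("row1\tx", "row3\ty"), StandardCharsets.UTF_8);

        System.out.println(joinByKey(exported, other));
    }
}
```

The point of the split is that the second stage never talks to HBase at all; it only sees files, so both inputs are read the same way.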
On Jul 3, 2013, at 10:34 AM, S. Zhou <[EMAIL PROTECTED]> wrote:

> Azuryy, I am looking at the MultipleInputs doc, but I could not figure out how to add an HBase table as a Path to the input. Do you have some sample code? Thanks!
>
> ________________________________
> From: Azuryy Yu <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; S. Zhou <[EMAIL PROTECTED]>
> Sent: Tuesday, July 2, 2013 10:06 PM
> Subject: Re: MapReduce job with mixed data sources: HBase table and HDFS files
>
>
> Hi,
>
> Use MultipleInputs, which can solve your problem.
>
>
> On Wed, Jul 3, 2013 at 12:34 PM, S. Zhou <[EMAIL PROTECTED]> wrote:
>
>> Hi there,
>>
>> I know how to create a MapReduce job with only an HBase table or only HDFS
>> files as the data source. Now I need to create a MapReduce job with mixed data
>> sources, that is, an MR job that reads data from both HBase and HDFS
>> files. Is it possible? If yes, could you share some sample code?
>>
>> Thanks!
>> Senqiang