HBase, mail # user - MapReduce job with mixed data sources: HBase table and HDFS files


Re: MapReduce job with mixed data sources: HBase table and HDFS files
Michael Segel 2013-07-03, 21:19
You may want to pull the data out of HBase first in a separate map-only job, and then use that job's output together with your other HDFS input.
Read performance differs significantly between HDFS and HBase.
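
A minimal sketch of that two-stage data flow, in plain Java with in-memory collections standing in for the HBase table and the HDFS files. It deliberately uses no Hadoop or HBase classes, so every name in it (MixedSourceJoinSketch, exportRows, mapWithTag, join, the "hbase"/"hdfs" tags, the tab-separated line layout) is hypothetical and only illustrates the shape of the work. In a real job, stage 1 would presumably be a TableMapper configured through TableMapReduceUtil.initTableMapperJob with zero reduce tasks, and stage 2 would read the exported files and the other HDFS files, for example via MultipleInputs.addInputPath with one mapper per path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class MixedSourceJoinSketch {

    // Stage 1 stand-in: a map-only pass that flattens HBase-style rows
    // (row key -> value) into tab-separated text lines, as if written to HDFS.
    static List<String> exportRows(Map<String, String> hbaseRows) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, String> row : hbaseRows.entrySet()) {
            lines.add(row.getKey() + "\t" + row.getValue());
        }
        return lines;
    }

    // Stage 2 mapper stand-in: parse "key<TAB>value" lines, tag each value
    // with its source, and group by key (the grouping mimics the shuffle).
    static void mapWithTag(List<String> lines, String tag,
                           Map<String, List<String>> grouped) {
        for (String line : lines) {
            String[] parts = line.split("\t", 2);
            if (parts.length < 2) {
                continue; // skip malformed lines that have no tab separator
            }
            grouped.computeIfAbsent(parts[0], k -> new ArrayList<>())
                   .add(tag + ":" + parts[1]);
        }
    }

    // Stage 2 as a whole: both inputs are now plain text lines, so one job
    // can read them side by side and a reducer sees all values per key.
    static Map<String, List<String>> join(List<String> exportedLines,
                                          List<String> hdfsLines) {
        Map<String, List<String>> grouped = new TreeMap<>();
        mapWithTag(exportedLines, "hbase", grouped);
        mapWithTag(hdfsLines, "hdfs", grouped);
        return grouped;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        Map<String, String> hbaseRows = new TreeMap<>();
        hbaseRows.put("user1", "alice");
        hbaseRows.put("user2", "bob");
        List<String> hdfsLines = Arrays.asList("user1\tclick", "user3\tview");

        join(exportRows(hbaseRows), hdfsLines)
            .forEach((key, values) -> System.out.println(key + " -> " + values));
    }
}
```

The point of the split is that stage 2 only ever reads files, so the HBase scan runs once, on its own, and its cost does not hold up the rest of the job.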
On Jul 3, 2013, at 10:34 AM, S. Zhou <[EMAIL PROTECTED]> wrote:

> Azuryy, I am looking at the MultipleInputs doc, but I could not figure out how to add an HBase table as a Path to the input. Do you have some sample code? Thanks!
>
> ________________________________
> From: Azuryy Yu <[EMAIL PROTECTED]>
> To: [EMAIL PROTECTED]; S. Zhou <[EMAIL PROTECTED]>
> Sent: Tuesday, July 2, 2013 10:06 PM
> Subject: Re: MapReduce job with mixed data sources: HBase table and HDFS files
>
>
> Hi,
>
> Use MultipleInputs, which can solve your problem.
>
>
> On Wed, Jul 3, 2013 at 12:34 PM, S. Zhou <[EMAIL PROTECTED]> wrote:
>
>> Hi there,
>>
>> I know how to create a MapReduce job with only an HBase table or only HDFS
>> files as the data source. Now I need to create a MapReduce job with mixed data
>> sources, that is, an MR job that needs to read data from both HBase and HDFS
>> files. Is it possible? If yes, could you share some sample code?
>>
>> Thanks!
>> Senqiang