HBase >> mail # user >> HBase MapReduce - Using mutiple tables as source


RE: HBase MapReduce - Using mutiple tables as source
A related question: can I have multiple tables as output in a single
MapReduce job?
I understand that this is achievable by running multiple MR jobs, each
with a different output table specified in the reduce class. What I want
is to scan a source table once and generate multiple tables at one time.
Thanks,
Best Regards,
Wei

Wei Tan
Research Staff Member
IBM T. J. Watson Research Center
19 Skyline Dr, Hawthorne, NY  10532
[EMAIL PROTECTED]; 914-784-6752
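
The "scan once, write to several tables" idea in the question above can be sketched in plain Java. This is only a model, not HBase code: the in-memory maps stand in for tables, and the class name, the table names "by_length" and "upper_case", and the derived values are all hypothetical. The point it illustrates is that each emitted record carries the name of its destination table, so a single pass over the source can feed any number of outputs. HBase ships an output format named MultiTableOutputFormat that follows this pattern (the output key names the destination table); whether it fits a given release should be checked against that release's Javadoc.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Plain-Java model of "scan the source once, write to several tables".
// Nothing here calls HBase; the maps are stand-ins for tables.
final class MultiTableRoutingSketch {

    // One output record: the destination table name plus the cell to store.
    static final class Routed {
        final String table;
        final String rowKey;
        final String value;

        Routed(String table, String rowKey, String value) {
            this.table = table;
            this.rowKey = rowKey;
            this.value = value;
        }
    }

    // Hypothetical map step: one source row yields a write for each of two tables.
    static List<Routed> map(String rowKey, String value) {
        List<Routed> out = new ArrayList<>();
        out.add(new Routed("by_length", rowKey, Integer.toString(value.length())));
        out.add(new Routed("upper_case", rowKey, value.toUpperCase(Locale.ROOT)));
        return out;
    }

    // Single pass over the source; each record lands in the table it names.
    static Map<String, Map<String, String>> run(Map<String, String> source) {
        Map<String, Map<String, String>> tables = new LinkedHashMap<>();
        for (Map.Entry<String, String> row : source.entrySet()) {
            for (Routed r : map(row.getKey(), row.getValue())) {
                tables.computeIfAbsent(r.table, t -> new LinkedHashMap<>())
                      .put(r.rowKey, r.value);
            }
        }
        return tables;
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<>();
        source.put("row1", "alpha");
        source.put("row2", "be");
        System.out.println(run(source));
    }
}
```

In a real job the routing step would live in the mapper (or reducer), and the output format, rather than a loop like run(), would deliver each write to the table named in its key.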

From:   "Amlan Roy" <[EMAIL PROTECTED]>
To:     <[EMAIL PROTECTED]>,
Date:   08/06/2012 09:05 AM
Subject:        RE: HBase MapReduce - Using mutiple tables as source

Hi,

If TableMapper and TableMapReduceUtil.initTableMapperJob() do not support
multiple tables as input, can I use the Hadoop Mapper/Reducer classes and
specify the input/output format myself?

I want to read two tables in the map phase and reduce them together. What
is the best solution available in 0.92.0? (I understand a better solution
is coming in version 0.96.0.)

Regards,
Amlan
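
The data flow being asked for (read two tables in the map phase, reduce them together) is a reduce-side join. The sketch below models that flow in plain Java under loud assumptions: it uses no HBase or Hadoop API, the maps stand in for tables, and the class name and the table names "users" and "orders" are hypothetical. It shows only the shape of the technique: the map step tags each value with its source table, the shuffle groups by row key, and the reduce step sees every tagged value for a key at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of a reduce-side join over two "tables".
// The maps are stand-ins; no HBase or Hadoop class is involved.
final class ReduceSideJoinSketch {

    // Map step: emit (rowKey, "table:value") so the reducer can tell sources apart.
    static void mapTable(String tableName,
                         Map<String, String> table,
                         Map<String, List<String>> shuffle) {
        for (Map.Entry<String, String> row : table.entrySet()) {
            shuffle.computeIfAbsent(row.getKey(), k -> new ArrayList<>())
                   .add(tableName + ":" + row.getValue());
        }
    }

    // Reduce step: all tagged values for one row key arrive together.
    static String reduce(String rowKey, List<String> tagged) {
        return rowKey + " -> " + String.join(", ", tagged);
    }

    // Runs both map steps, then reduces each key in sorted order.
    static List<String> join(Map<String, String> users, Map<String, String> orders) {
        Map<String, List<String>> shuffle = new TreeMap<>();
        mapTable("users", users, shuffle);
        mapTable("orders", orders, shuffle);

        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<String>> group : shuffle.entrySet()) {
            out.add(reduce(group.getKey(), group.getValue()));
        }
        return out;
    }
}
```

With a custom input format, as the message suggests, the tagging would happen wherever the record's source table is known, typically in the input split or the mapper.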

-----Original Message-----
From: Ioakim Perros [mailto:[EMAIL PROTECTED]]
Sent: Monday, August 06, 2012 5:11 PM
To: [EMAIL PROTECTED]
Subject: Re: HBase MapReduce - Using mutiple tables as source

Hi,

Isn't it the case that you can always open a scanner inside a map task,
on a table other than the one set in the job configuration by
TableMapReduceUtil.initTableMapperJob(...)?

Hope this serves as a temporary solution.
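
The workaround described above amounts to a map-side lookup: the job iterates over one table, and the map step reads the second table itself. A minimal plain-Java model follows, assuming in-memory maps in place of tables; the class and method names are hypothetical and no HBase call is made. In a real mapper the secondary.get(...) line would be a client read (a Get or a scan) against the second table, so each input row costs an extra round trip.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of a map step that consults a second table per input row.
// The maps are stand-ins for tables; no HBase class is involved.
final class SideLookupSketch {

    // Iterates the primary table (the job's input) and looks up each row key
    // in the secondary table, combining the two values when both exist.
    static Map<String, String> mapWithLookup(Map<String, String> primary,
                                             Map<String, String> secondary) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> row : primary.entrySet()) {
            // Stand-in for a read against the second table from inside the mapper.
            String other = secondary.get(row.getKey());
            if (other != null) {
                out.put(row.getKey(), row.getValue() + "|" + other);
            }
        }
        return out;
    }
}
```

This keeps the job itself single-input, at the price of one lookup per row, which is why it is offered only as a stopgap.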

On 08/06/2012 02:35 PM, Mohammad Tariq wrote:
> Hello Amlan,
>
>      The issue is still unresolved. It will be fixed in 0.96.0.
>
> Regards,
>      Mohammad Tariq
>
>
> On Mon, Aug 6, 2012 at 5:01 PM, Amlan Roy <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>>
>>
>> While writing a MapReduce job for HBase, can I use multiple tables as input?
>> I think TableMapReduceUtil.initTableMapperJob() takes a single table as a
>> parameter. For my requirement, I want to specify multiple tables and Scan
>> instances. I read about MultiTableInputCollection in
>> https://issues.apache.org/jira/browse/HBASE-3996, but I can't find it in
>> HBase-0.92.0.
>>
>>
>>
>> Regards,
>>
>> Amlan
>>