HBase MapReduce - Using multiple tables as source
Amlan Roy 2012-08-06, 11:31
Hi,

While writing a MapReduce job for HBase, can I use multiple tables as input? I think TableMapReduceUtil.initTableMapperJob() takes a single table as a parameter. For my requirement, I want to specify multiple tables and Scan instances. I read about MultiTableInputCollection in https://issues.apache.org/jira/browse/HBASE-3996, but I can't find it in HBase 0.92.0.

Regards,
Amlan
Re: HBase MapReduce - Using multiple tables as source
Mohammad Tariq 2012-08-06, 11:35
Hello Amlan,

The issue is still unresolved. It will get fixed in 0.96.0.

Regards,
Mohammad Tariq
Re: HBase MapReduce - Using multiple tables as source
Ioakim Perros 2012-08-06, 11:40
Hi,

Isn't it the case that you can always open a scanner inside a map task, on a table other than the one set in the configuration by TableMapReduceUtil.initTableMapperJob(...)?

I hope this serves as a temporary solution.
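To make the workaround concrete, here is a minimal sketch of the idea: the job iterates over one table as its normal input and, for each row, looks up the matching row in a second table from inside the map step. It does not use the HBase client API, so it runs without a cluster. The class name SideTableLookup, the method mapAll and the in-memory maps are hypothetical stand-ins; in a real job the lookup would be a read against a second table handle opened by the task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: per-row lookup into a second table from inside the map step. */
class SideTableLookup {

    /**
     * Walks the primary input row by row and joins each row with the row of the
     * same key in the second table. Both maps are in-memory stand-ins (assumption).
     */
    static List<String> mapAll(Map<String, String> primaryRows,
                               Map<String, String> secondTable) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> row : primaryRows.entrySet()) {
            // Stand-in for a read against the second table by row key.
            String other = secondTable.get(row.getKey());
            if (other != null) {
                // Emit only keys that exist in both tables.
                out.add(row.getKey() + "\t" + row.getValue() + "\t" + other);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> primary = new TreeMap<>();
        primary.put("r1", "a");
        primary.put("r2", "b");
        Map<String, String> second = new TreeMap<>();
        second.put("r1", "x");
        second.put("r3", "z");
        for (String line : mapAll(primary, second)) {
            System.out.println(line);
        }
    }
}
```

Note that this pattern issues one extra read per input row, so it is likely to suit cases where the second table is small or the lookups are cheap.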
RE: HBase MapReduce - Using multiple tables as source
Amlan Roy 2012-08-06, 13:02
Hi,

If TableMapper and TableMapReduceUtil.initTableMapperJob() do not support multiple tables as input, can I use the Hadoop Mapper/Reducer classes and specify the input/output format myself?

What I want to do is read two tables in the map phase and reduce them together. What is the best solution available in 0.92.0? (I understand the best solution is coming in version 0.96.0.)

Regards,
Amlan
RE: HBase MapReduce - Using multiple tables as source
Wei Tan 2012-08-06, 14:22
A related question: may I have multiple tables as output in a single Map job? I understand that this is achievable by running multiple MR jobs, each with a different output table specified in the reduce class. What I want is to scan a source table once and generate multiple tables at one time.

Thanks,
Best Regards,
Wei

Wei Tan
Research Staff Member
IBM T. J. Watson Research Center
19 Skyline Dr, Hawthorne, NY 10532
[EMAIL PROTECTED]; 914-784-6752
Re: HBase MapReduce - Using multiple tables as source
Stack 2012-08-06, 14:56
On Mon, Aug 6, 2012 at 3:22 PM, Wei Tan <[EMAIL PROTECTED]> wrote:
> I understand that this is achievable by running multiple MR jobs, each
> with a different output table specified in the reduce class. What I want
> is to scan a source table once and generate multiple tables at one time.

There is nothing in HBase natively that will do this, but there is no reason you can't do it in a map or reduce task. You'd set up two or more HTable instances on task init, each pointing to a particular table. Then, inside your task, you'd send each put to one of those HTables, switching on whatever criterion you fancy.
St.Ack
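
The approach described above can be sketched roughly as follows. To keep the example runnable without a cluster, the HTable instances are replaced by a hypothetical TableSink interface with an in-memory implementation. The table names "events" and "errors" and the routing rule are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a table handle, so the sketch needs no HBase cluster. */
interface TableSink {
    void put(String rowKey, String value);
    void close();
}

/** In-memory sink that records what would have been written to a table. */
class InMemorySink implements TableSink {
    final List<String> rows = new ArrayList<>();
    boolean closed = false;

    @Override public void put(String rowKey, String value) { rows.add(rowKey + "=" + value); }
    @Override public void close() { closed = true; }
}

/** Holds one handle per output table and picks a destination for every record. */
class MultiTableRouter {
    private final Map<String, TableSink> sinks = new HashMap<>();

    /** Task init: register one handle per output table. */
    void open(String tableName, TableSink sink) { sinks.put(tableName, sink); }

    /** Per record: choose the target table, then send the put to it. */
    void route(String rowKey, String value) {
        // Hypothetical routing rule; any per-record condition would do.
        String target = value.startsWith("err") ? "errors" : "events";
        sinks.get(target).put(rowKey, value);
    }

    /** Task teardown: close every handle that was opened. */
    void closeAll() {
        for (TableSink sink : sinks.values()) {
            sink.close();
        }
    }

    public static void main(String[] args) {
        InMemorySink events = new InMemorySink();
        InMemorySink errors = new InMemorySink();
        MultiTableRouter router = new MultiTableRouter();
        router.open("events", events);
        router.open("errors", errors);
        router.route("row1", "ok:login");
        router.route("row2", "err:timeout");
        router.closeAll();
        System.out.println("events=" + events.rows + " errors=" + errors.rows);
    }
}
```

In a real task the sinks would presumably be HTable instances created in the task's setup() and closed in cleanup(), with route() called from map() or reduce().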
Re: HBase MapReduce - Using multiple tables as source
Sonal Goyal 2012-08-06, 12:17
Hi Amlan,

I think if you share your use case for two tables as inputs, people on the mailing list may be able to help you better. For example, are you looking at joining the two tables? What are the sizes of the tables?

Best Regards,
Sonal
Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>
Re: HBase MapReduce - Using multiple tables as source
jmozah 2012-08-06, 16:04
It's available only as a patch on trunk for now. You won't find it in 0.92.0.

./zahoor

On 06-Aug-2012, at 5:01 PM, Amlan Roy <[EMAIL PROTECTED]> wrote:
> https://issues.apache.org/jira/browse/HBASE-3996
Re: HBase MapReduce - Using multiple tables as source
Ferdy Galema 2012-08-06, 14:38
Hi,

Perhaps you want to take a look at MultipleInputs. I'm not sure if it works with TableInputFormat, but at least you can use it for inspiration.

Ferdy.
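
As a rough illustration of the pattern that MultipleInputs-style jobs usually rely on, the sketch below tags every value with the input it came from, so that a single reduce step can combine rows from two sources. It simulates the map, shuffle and reduce phases in plain Java and makes no claim about whether MultipleInputs accepts TableInputFormat. The class name TaggedJoin and the tags "A" and "B" are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: reduce-side join of two inputs using source tags. */
class TaggedJoin {
    static final String TAG_A = "A";
    static final String TAG_B = "B";

    /** Map phase for one input: emit (rowKey, tag:value) so the reducer knows the source. */
    static void mapTable(String tag, Map<String, String> table,
                         Map<String, List<String>> shuffle) {
        for (Map.Entry<String, String> row : table.entrySet()) {
            shuffle.computeIfAbsent(row.getKey(), k -> new ArrayList<>())
                   .add(tag + ":" + row.getValue());
        }
    }

    /** Reduce phase: join the keys that received a value from both inputs. */
    static List<String> reduce(Map<String, List<String>> shuffle) {
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : shuffle.entrySet()) {
            String a = null;
            String b = null;
            for (String value : entry.getValue()) {
                if (value.startsWith(TAG_A + ":")) {
                    a = value.substring(TAG_A.length() + 1);
                } else if (value.startsWith(TAG_B + ":")) {
                    b = value.substring(TAG_B.length() + 1);
                }
            }
            if (a != null && b != null) {
                joined.add(entry.getKey() + "\t" + a + "\t" + b);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, String> tableA = new TreeMap<>();
        tableA.put("k1", "a1");
        tableA.put("k2", "a2");
        Map<String, String> tableB = new TreeMap<>();
        tableB.put("k2", "b2");
        tableB.put("k3", "b3");
        // A sorted map stands in for the shuffle, which groups values by key.
        Map<String, List<String>> shuffle = new TreeMap<>();
        mapTable(TAG_A, tableA, shuffle);
        mapTable(TAG_B, tableB, shuffle);
        for (String line : reduce(shuffle)) {
            System.out.println(line);
        }
    }
}
```

The tag is what lets one reducer serve both inputs. Without it, values arriving under the same key could not be attributed to a table.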