-
HBase MapReduce - Using multiple tables as source
Amlan Roy 2012-08-06, 11:31
Hi,
While writing a MapReduce job for HBase, can I use multiple tables as input? I think TableMapReduceUtil.initTableMapperJob() takes a single table as a parameter. For my requirement, I want to specify multiple tables and Scan instances. I read about MultiTableInputCollection in https://issues.apache.org/jira/browse/HBASE-3996, but I can't find it in HBase 0.92.0.
Regards,
Amlan
-
Re: HBase MapReduce - Using multiple tables as source
Mohammad Tariq 2012-08-06, 11:35
Hello Amlan,
The issue is still unresolved. It will get fixed in 0.96.0.
Regards,
Mohammad Tariq
-
Re: HBase MapReduce - Using multiple tables as source
Ioakim Perros 2012-08-06, 11:40
Hi,
Isn't it the case that you can always open a scanner inside a map task, on a table other than the one passed to TableMapReduceUtil.initTableMapperJob(...)?
Hope this serves as a temporary solution.
-
Re: HBase MapReduce - Using multiple tables as source
Sonal Goyal 2012-08-06, 12:17
Hi Amlan,
I think if you share your use case for two tables as inputs, people on the mailing list may be able to help you better. For example, are you looking at joining the two tables? What are the sizes of the tables?
Best Regards,
Sonal
Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>
-
RE: HBase MapReduce - Using multiple tables as source
Amlan Roy 2012-08-06, 13:02
Hi,
If TableMapper and TableMapReduceUtil.initTableMapperJob() do not support multiple tables as input, can I use the Hadoop Mapper/Reducer classes and specify the input/output format myself?
What I want to do is read two tables in the map phase and reduce them together. What is the best solution available in 0.92.0? (I understand the best solution is coming in version 0.96.0.)
Regards,
Amlan
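Reading two tables in the map phase and reducing them together is usually called a reduce-side join: every value a mapper emits is tagged with the table it came from, so the reducer can tell the two sources apart once the values are grouped by row key. The sketch below models only that tagging and grouping logic in plain Java. It does not use any HBase or Hadoop API and says nothing about how rows from two tables reach one job; in-memory maps stand in for the tables and the shuffle, and every name in it (ReduceSideJoinSketch, Tagged, mapTable, shuffle, join, the table names) is hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of a reduce-side join. Nothing here is HBase or Hadoop API:
// the maps stand in for two tables and for the shuffle (all names hypothetical).
class ReduceSideJoinSketch {

    // A value tagged with the name of the table it was read from.
    static final class Tagged {
        final String table;
        final String value;

        Tagged(String table, String value) {
            this.table = table;
            this.value = value;
        }
    }

    // "Map" phase: emit (rowKey, tagged value) for every row of one table.
    static void mapTable(String table, Map<String, String> rows,
                         List<Map.Entry<String, Tagged>> emitted) {
        for (Map.Entry<String, String> row : rows.entrySet()) {
            emitted.add(new SimpleEntry<>(row.getKey(), new Tagged(table, row.getValue())));
        }
    }

    // "Shuffle": group the emitted values by row key, sorted by key.
    static Map<String, List<Tagged>> shuffle(List<Map.Entry<String, Tagged>> emitted) {
        Map<String, List<Tagged>> grouped = new TreeMap<>();
        for (Map.Entry<String, Tagged> kv : emitted) {
            List<Tagged> values = grouped.get(kv.getKey());
            if (values == null) {
                values = new ArrayList<>();
                grouped.put(kv.getKey(), values);
            }
            values.add(kv.getValue());
        }
        return grouped;
    }

    // "Reduce" phase: keep a key only if both tables contributed a value for it.
    static List<String> join(String left, Map<String, String> leftRows,
                             String right, Map<String, String> rightRows) {
        List<Map.Entry<String, Tagged>> emitted = new ArrayList<>();
        mapTable(left, leftRows, emitted);
        mapTable(right, rightRows, emitted);

        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, List<Tagged>> group : shuffle(emitted).entrySet()) {
            String l = null;
            String r = null;
            for (Tagged t : group.getValue()) {
                if (t.table.equals(left)) {
                    l = t.value;
                } else {
                    r = t.value;
                }
            }
            if (l != null && r != null) {
                joined.add(group.getKey() + ": " + l + " | " + r);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<String, String> users = new LinkedHashMap<>();
        users.put("row1", "alice");
        users.put("row2", "bob");
        Map<String, String> orders = new LinkedHashMap<>();
        orders.put("row2", "book");
        orders.put("row3", "pen");
        // Only row2 exists in both tables, so this prints [row2: bob | book]
        System.out.println(join("users", users, "orders", orders));
    }
}
```

The tag is the important part: without it, the reducer sees a bag of values per row key and cannot tell which table each one came from.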
-
RE: HBase MapReduce - Using multiple tables as source
Wei Tan 2012-08-06, 14:22
A related question: may I have multiple tables as output in a single Map job? I understand that this is achievable by running multiple MR jobs, each with a different output table specified in the reduce class. What I want is to scan a source table once and generate multiple tables at one time.
Thanks,
Best Regards,
Wei
Wei Tan
Research Staff Member
IBM T. J. Watson Research Center
19 Skyline Dr, Hawthorne, NY 10532
[EMAIL PROTECTED]; 914-784-6752
-
Re: HBase MapReduce - Using multiple tables as source
Ferdy Galema 2012-08-06, 14:38
Hi,
Perhaps you want to take a look at MultipleInputs. I'm not sure if it works for TableInputFormat, but at least you can use it for inspiration.
Ferdy.
-
Re: HBase MapReduce - Using multiple tables as source
Stack 2012-08-06, 14:56
On Mon, Aug 6, 2012 at 3:22 PM, Wei Tan <[EMAIL PROTECTED]> wrote:
> I understand that this is achievable by running multiple MR jobs, each
> with a different output table specified in the reduce class. What I want
> is to scan a source table once and generate multiple tables at one time.
> Thanks,
There is nothing in HBase natively that will do this, but there is no reason you can't do it in a map or reduce task. You'd set up two or more HTable instances on task init, each pointing to a particular table. Then, inside your task, you'd send the puts to one of the possible HTables, switching on whatever you fancy.
St.Ack
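A minimal sketch of the shape described above: open one handle per output table when the task starts, route each row to exactly one of them while the task runs, and release the handles when it ends. It is plain Java with no HBase API in it. Each list stands in for an HTable, the strings stand in for Puts, and the class name, method names, table names and routing rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of "scan once, write to several tables". Nothing here is
// HBase API: each list stands in for an HTable opened at task init, and the
// strings stand in for Puts (all names are hypothetical).
class MultiTableRoutingSketch {

    private final Map<String, List<String>> tables = new LinkedHashMap<>();

    // Task init: open one handle per output table.
    void setup(String... tableNames) {
        for (String name : tableNames) {
            tables.put(name, new ArrayList<String>());
        }
    }

    // Routing rule, chosen arbitrarily for the example: pick a table from the row key.
    static String chooseTable(String rowKey) {
        return rowKey.startsWith("err-") ? "errors" : "events";
    }

    // Per-row work: build the "put" and send it to exactly one of the open tables.
    void map(String rowKey, String value) {
        String name = chooseTable(rowKey);
        List<String> target = tables.get(name);
        if (target == null) {
            throw new IllegalStateException("table not opened: " + name);
        }
        target.add(rowKey + "=" + value);
    }

    // Task end: a real task would flush and close each table handle here.
    Map<String, List<String>> cleanup() {
        return tables;
    }

    public static void main(String[] args) {
        MultiTableRoutingSketch task = new MultiTableRoutingSketch();
        task.setup("events", "errors");
        task.map("evt-1", "login");
        task.map("err-7", "timeout");
        task.map("evt-2", "logout");
        // prints {events=[evt-1=login, evt-2=logout], errors=[err-7=timeout]}
        System.out.println(task.cleanup());
    }
}
```

Opening the handles once per task rather than once per row is the point of doing the setup at task init: the per-row path then only chooses a target and writes to it.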
-
Re: HBase MapReduce - Using multiple tables as source
jmozah 2012-08-06, 16:04
It's available only as a patch on trunk for now. You won't find it in 0.92.0.
./zahoor
On 06-Aug-2012, at 5:01 PM, Amlan Roy <[EMAIL PROTECTED]> wrote:
> https://issues.apache.org/jira/browse/HBASE-3996