HBase >> mail # user >> multitable query


Re: multitable query
Use 3 jobs: one to scan each of the two tables, and a third to do a map-side join of their outputs. Make sure the first two jobs use the same sort order and the same partitioning.
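
As a rough illustration of why matching sort order and partitioning matter, here is a minimal sketch of the join step for a single pair of matching partitions. It is not HBase or Hadoop code: the class `SortedMergeJoinSketch` and its `join` method are hypothetical, the two lists stand in for the outputs of the first two jobs, and row keys are assumed to be unique per side and compared as plain strings (HBase itself orders row keys as bytes).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the third job's join step. Both inputs are
// assumed to be sorted by row key and to belong to the same partition,
// as the first two jobs would have written them.
final class SortedMergeJoinSketch {

    /** Inner-joins two key-sorted lists of (rowKey, value) in a single pass. */
    static List<String> join(List<Map.Entry<String, String>> left,
                             List<Map.Entry<String, String>> right) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            Map.Entry<String, String> l = left.get(i);
            Map.Entry<String, String> r = right.get(j);
            int cmp = l.getKey().compareTo(r.getKey());
            if (cmp == 0) {
                // Same row key on both sides: emit the joined record.
                out.add(l.getKey() + "\t" + l.getValue() + "\t" + r.getValue());
                i++;
                j++;
            } else if (cmp < 0) {
                i++; // left key is smaller, so it has no partner on the right
            } else {
                j++; // right key is smaller, so it has no partner on the left
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> tableA =
                List.of(Map.entry("r1", "a1"), Map.entry("r2", "a2"), Map.entry("r4", "a4"));
        List<Map.Entry<String, String>> tableB =
                List.of(Map.entry("r2", "b2"), Map.entry("r3", "b3"), Map.entry("r4", "b4"));
        join(tableA, tableB).forEach(System.out::println);
    }
}
```

Because both sides arrive sorted, the join needs no shuffle and no buffering of either input. If the two outputs were sorted or partitioned differently, rows with the same key could land in different files and this single pass would miss them.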

Sent from iPhone.

On Aug 10, 2012, at 9:41 AM, Weishung Chung <[EMAIL PROTECTED]> wrote:

> but they are in production now
>
> On Fri, Aug 10, 2012 at 6:39 AM, Weishung Chung <[EMAIL PROTECTED]> wrote:
>
>> Thank you. I am trying to avoid fetching by gets and would like to do
>> something like Hadoop MultipleInputs.
>> Yes, it would be nice if I could denormalize and remodel the schema.
>>
>>
>> On Fri, Aug 10, 2012 at 6:29 AM, Amandeep Khurana <[EMAIL PROTECTED]> wrote:
>>
>>> You can scan over one of the tables (using TableInputFormat) and do simple
>>> gets on the other table for every row that you want to join.
>>>
>>> An interesting question to address here would be: why do you even need a
>>> join? Can you talk more about the data and what you are trying to do? In
>>> general you really want to denormalize and not need joins when working
>>> with HBase (or, for that matter, most NoSQL stores).
>>>
>>> On Fri, Aug 10, 2012 at 6:52 PM, Weishung Chung <[EMAIL PROTECTED]>
>>> wrote:
>>>
>>>> Basically a join of two data sets on the same row key.
>>>>
>>>> On Fri, Aug 10, 2012 at 6:12 AM, Amandeep Khurana <[EMAIL PROTECTED]>
>>>> wrote:
>>>>
>>>>> How do you want to use two tables? Can you explain your algo a bit?
>>>>>
>>>>> On Fri, Aug 10, 2012 at 6:40 PM, Weishung Chung <[EMAIL PROTECTED]>
>>>>> wrote:
>>>>>
>>>>>> Hi HBase users,
>>>>>>
>>>>>> I need to pull data from 2 HBase tables in a mapreduce job. For 1 table
>>>>>> input, I use TableMapReduceUtil.initTableMapperJob. Is there another
>>>>>> method for multitable inputs?
>>>>>>
>>>>>> Thank you,
>>>>>> Wei Shung
>>>>>>
>>>>>
>>>>
>>>
>>
>>
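
For comparison, the scan-plus-gets approach suggested in the quoted thread can be sketched the same way. This is only an in-memory illustration, not HBase client code: the class `ScanWithGetsSketch` is hypothetical, the `TreeMap` stands in for the table being scanned (iterated in row-key order), and the plain `Map` stands in for the second table, with `Map.get` playing the role of a point lookup by row key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory illustration of "scan one table, get from the other".
final class ScanWithGetsSketch {

    /** Walks `scanned` in key order and looks up each row key in `other`. */
    static List<String> join(NavigableMap<String, String> scanned,
                             Map<String, String> other) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> row : scanned.entrySet()) {
            // One point lookup per scanned row; against a real table this
            // would be one read request per row.
            String match = other.get(row.getKey());
            if (match != null) {
                out.add(row.getKey() + "\t" + row.getValue() + "\t" + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> tableA = new TreeMap<>();
        tableA.put("r1", "a1");
        tableA.put("r2", "a2");
        tableA.put("r4", "a4");

        Map<String, String> tableB = new HashMap<>();
        tableB.put("r2", "b2");
        tableB.put("r3", "b3");
        tableB.put("r4", "b4");

        join(tableA, tableB).forEach(System.out::println);
    }
}
```

The per-row lookup is the cost the original poster wanted to avoid: the number of lookups grows with the size of the scanned table, whereas the three-job plan reads each table once.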