Re: How to join 2 tables using hadoop?
yonghu 2013-07-19, 09:35
You can do this in a single MR job. In the Map function, read both tables
and emit the join key as the output key: the referencing key for one table
and the primary key for the other. Tag each value with the table it came
from so the two sides can be told apart later. In the Reduce function, you
can then "join" the tuples that share the same key. Please note that this
is a very naive approach (a reduce-side join); for more join optimization
options, take a look at the strategies that Pig or Hive use.
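
To make the idea concrete, here is a rough sketch of the same flow in plain
Python. It only simulates the map, shuffle and reduce steps in memory; it is
not Hadoop or HBase API code. The sample rows, the "id" field name and all
function names are my own assumptions for illustration, with table A holding
the JSON values and table B being the table the id points to.

```python
import json
from collections import defaultdict

# Hypothetical sample data: table A rows hold a JSON value with an "id"
# field (assumed name); table B is keyed by that id.
TABLE_A = {
    "a1": '{"id": "b1", "name": "first"}',
    "a2": '{"id": "b2", "name": "second"}',
    "a3": '{"id": "b9", "name": "orphan"}',
}
TABLE_B = {"b1": "value-one", "b2": "value-two"}


def map_table_a(row_key, json_value):
    """Emit the referenced id as the join key, tagged with source table A."""
    record = json.loads(json_value)
    yield record["id"], ("A", row_key)


def map_table_b(row_key, value):
    """Emit the primary key as the join key, tagged with source table B."""
    yield row_key, ("B", value)


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle phase."""
    groups = defaultdict(list)
    for key, tagged in pairs:
        groups[key].append(tagged)
    return groups


def reduce_join(key, tagged_values):
    """Pair every table A row with every table B value sharing the key."""
    a_rows = [v for tag, v in tagged_values if tag == "A"]
    b_values = [v for tag, v in tagged_values if tag == "B"]
    for a_row in a_rows:
        for b_value in b_values:
            yield a_row, key, b_value


def run_join(table_a, table_b):
    """Run map over both tables, shuffle, then reduce each key group."""
    pairs = []
    for k, v in table_a.items():
        pairs.extend(map_table_a(k, v))
    for k, v in table_b.items():
        pairs.extend(map_table_b(k, v))
    joined = []
    for key, values in sorted(shuffle(pairs).items()):
        joined.extend(reduce_join(key, values))
    return joined


if __name__ == "__main__":
    for row in run_join(TABLE_A, TABLE_B):
        print(row)
```

Rows of table A whose id has no match in table B (like "a3" here) are simply
dropped, so this behaves like an inner join. In a real job the mappers would
read from the two tables and the framework would do the grouping for you.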
On Fri, Jul 19, 2013 at 10:17 AM, Nitin Pawar <[EMAIL PROTECTED]> wrote:
> Try hive with hbase storage handler
> On Fri, Jul 19, 2013 at 9:54 AM, Pavan Sudheendra <[EMAIL PROTECTED]> wrote:
> > Hi,
> > I know that HBase by default doesn't support table joins like an RDBMS.
> > But anyway, I have a table whose value contains JSON with a particular
> > ID in it.
> > This ID references another table, where it is a key.
> > I want to fetch the ID first from table A, query table B and get its
> > corresponding value.
> > What is the best way of achieving this using the MR framework?
> > Apologies, I'm still new to Hadoop and HBase, so please go easy on me.
> > Thanks for any help
> > --
> > Regards-
> > Pavan
> Nitin Pawar