Re: Querying RDBMS table in a Hive query
Jan Dolinár 2012-06-14, 06:03
Hi Ruslan,

I've been in a similar situation and solved it by writing a custom
InputFormat and LineReader that loads the data from MySQL in its
constructor. In my case I use it just to check value ranges and
similar things. If you want to join the data with what's in your HDFS
files, you can do that as well; the InputFormat lets you add the
columns easily. I'm not sure how well this solution would behave for
bigger data, but for small data (I load about 5 tables, ~100 lines
each) it works just fine.
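
To give a rough idea of the approach, here is a minimal sketch in plain
Java. It is not my actual code and it does not use the Hadoop classes:
LookupJoiningReader is a hypothetical stand-in for the record reader that
a custom InputFormat would return, and the Supplier stands in for
whatever JDBC query loads the small MySQL table. The key being the first
tab-separated column and the \N placeholder for a missing match are
assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a custom record reader (no Hadoop dependency).
class LookupJoiningReader {
    private final BufferedReader lines;
    private final Map<String, String> lookup;

    // The small RDBMS table is loaded once, in the constructor.
    // In a real job the supplier would run a JDBC query against MySQL.
    LookupJoiningReader(BufferedReader lines, Supplier<Map<String, String>> loader) {
        this.lines = lines;
        this.lookup = new HashMap<>(loader.get());
    }

    // Returns the next tab-separated line with the looked-up value appended
    // as an extra column, or null when the input is exhausted.
    String next() {
        try {
            String line = lines.readLine();
            if (line == null) {
                return null;
            }
            // Assumption: the join key is the first column of the line.
            String key = line.split("\t", -1)[0];
            // Assumption: a missing match is written as \N.
            String extra = lookup.getOrDefault(key, "\\N");
            return line + "\t" + extra;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class LookupJoinSketch {
    public static void main(String[] args) {
        Map<String, String> countries = new HashMap<>();
        countries.put("1", "CZ"); // pretend this row came from MySQL
        BufferedReader input = new BufferedReader(new StringReader("1\talice\n2\tbob\n"));
        LookupJoiningReader reader = new LookupJoiningReader(input, () -> countries);
        for (String row = reader.next(); row != null; row = reader.next()) {
            System.out.println(row);
        }
    }
}
```

Since the whole table is held in memory by every reader instance, this
only makes sense while the RDBMS side stays small.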

Best Regards,
Jan

On 6/13/12, Ruslan Al-Fakikh <[EMAIL PROTECTED]> wrote:
> Hello to everyone,
>
> I need to join hdfs data with little data taken from RDBMS. A possible
> solution is to import RDBMS data to a regular hive table using Sqoop,
> but this way I'll have to keep that imported hive table up-to-date
> which means that I will have to update it every time before joining in
> a query.
> Is there a way to load RDBMS data on the fly? Maybe a UDF which would
> take RDBMS connection properties and load the data?
>
> Thanks in advance,
> Ruslan Al-Fakikh
>