Hive, mail # user - Run queries from external files as subqueries

Sha Liu 2013-06-20, 19:59
Bertrand Dechoux 2013-06-20, 20:09
Re: Run queries from external files as subqueries
Jan Dolinár 2013-06-20, 20:54
A quick and dirty way to do such a thing would be to use some kind of
preprocessor. To avoid writing one, you could use, e.g., the one from GCC,
with just a little help from sed:

    gcc -E -x c query.hql -o- | sed '/^#/d' > preprocessed.hql
    hive -f preprocessed.hql

Where query.hql can contain for example something like

    SELECT * FROM (
        #include "subquery.hql"
    ) t
    WHERE id = 1;

The includes can be nested and repeated as much as necessary. As a bonus,
you could also use #define for repeated parts of code and/or #ifdef to
build different queries based on parameters passed to gcc ;-)

Best regards,
Jan Dolinar
On Thu, Jun 20, 2013 at 10:09 PM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote:

> I am afraid that there is no automatic way of doing so. But that would be
> the same answer whether the question is about Hive or any relational
> database.
> (I would be glad to have counterexamples.)
> You might want to look at Oozie in order to manage workflows. But the
> creation of the workflow is manual indeed.
> http://oozie.apache.org/
> Regards
> Bertrand
> On Thu, Jun 20, 2013 at 9:59 PM, Sha Liu <[EMAIL PROTECTED]> wrote:
>> Hi,
>> While working on some complex queries with multiple levels of subqueries,
>> I'm wondering if it is possible in Hive to refactor these subqueries into
>> different files and instruct the enclosing query to execute these files.
>> This way these subqueries can potentially be reused by other queries or
>> just run by themselves.
>> Thanks,
>> Sha Liu
> --
> Bertrand Dechoux