Re: Run queries from external files as subqueries
A quick and dirty way to do this would be to use some kind of
preprocessor. To avoid writing one yourself, you could use an existing one,
e.g. the one from GCC, with just a little help from sed:

    gcc -E -x c query.hql -o- | sed '/#/d' > preprocessed.hql
    hive -f preprocessed.hql

Here, query.hql can contain something like the following:

    SELECT * FROM (
        #include "subquery.hql"
    ) t
    WHERE id = 1;

The includes can be nested and repeated as often as necessary. As a bonus,
you could also use #define for repeated parts of code and/or #ifdef to
build different queries based on parameters passed to gcc ;-)
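
As a rough sketch of the #define and #ifdef idea, assuming gcc and sed are available: the file name report.hql, the macro names RECENT and ONLY_ACTIVE, and the table and column names are all hypothetical. The parameter is passed with gcc's -D option, which defines a macro on the command line.

```shell
# Work in a scratch directory so the sketch does not clobber real files.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical query template: RECENT is a reusable fragment,
# ONLY_ACTIVE is an optional switch set from the command line.
cat > report.hql <<'EOF'
#define RECENT dt >= 20130601
SELECT id FROM events
WHERE RECENT
#ifdef ONLY_ACTIVE
  AND active = 1
#endif
;
EOF

# Without the switch: the #ifdef block is left out.
gcc -E -x c report.hql | sed '/#/d' > all.hql

# With -DONLY_ACTIVE: the extra condition is included.
gcc -E -x c -DONLY_ACTIVE report.hql | sed '/#/d' > active.hql

cat all.hql
cat active.hql
```

Both output files expand RECENT to the date condition; only active.hql carries the extra filter. Because the preprocessor expands any identifier that matches a macro name, macro names are best chosen so that they cannot collide with table or column names.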

Best regards,
Jan Dolinar
On Thu, Jun 20, 2013 at 10:09 PM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote:

> I am afraid that there is no automatic way of doing so. But that would be
> the same answer whether the question is about Hive or any relational
> database.
> (I would be glad to see counterexamples.)
> You might want to look at Oozie in order to manage the workflow. But the
> creation of the workflow is indeed manual.
> http://oozie.apache.org/
> Regards
> Bertrand
> On Thu, Jun 20, 2013 at 9:59 PM, Sha Liu <[EMAIL PROTECTED]> wrote:
>> Hi,
>> While working on some complex queries with multiple levels of subqueries,
>> I'm wondering if it is possible in Hive to refactor these subqueries into
>> different files and instruct the enclosing query to execute these files.
>> This way these subqueries can potentially be reused by other queries or
>> just run by themselves.
>> Thanks,
>> Sha Liu
> --
> Bertrand Dechoux