Evan Pollan 2012-02-05, 22:10
Hive 0.8.0 has metadata optimizations, but your best bet is to write a shell
script that executes 'show partitions <table_name>;', loops through the
results, and drops any partitions that meet your criteria. You can then
create a cron job to run the shell script regularly.
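
A minimal sketch of that approach, not a tested production script. It assumes a hypothetical table named "events" with a single partition column "dt" whose values look like dt=2012-01-29; adjust both to your schema. The function only generates the DROP statements, so you can inspect them before handing them to hive.

```shell
#!/bin/sh
# Assumptions (hypothetical, not from the thread): the partition column is
# named "dt" and its values are ISO dates, e.g. dt=2012-01-29.

# Read `show partitions` output on stdin (one "dt=YYYY-MM-DD" per line) and
# print one ALTER TABLE ... DROP PARTITION statement for every partition
# strictly older than the cutoff date.
gen_drop_statements() {
    table="$1"    # table name
    cutoff="$2"   # cutoff date, YYYY-MM-DD
    cutoff_num=$(printf '%s' "$cutoff" | tr -d '-')
    while IFS= read -r line; do
        case "$line" in
            dt=[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) day="${line#dt=}" ;;
            *) continue ;;   # skip anything that is not a dt=<date> line
        esac
        # With the hyphens removed, dates compare correctly as integers.
        day_num=$(printf '%s' "$day" | tr -d '-')
        if [ "$day_num" -lt "$cutoff_num" ]; then
            printf "ALTER TABLE %s DROP PARTITION (dt='%s');\n" "$table" "$day"
        fi
    done
}

# Example wiring, commented out because it needs the hive CLI and GNU date;
# the /tmp path is a placeholder:
# cutoff=$(date -d '7 days ago' +%Y-%m-%d)
# hive -S -e 'show partitions events;' | gen_drop_statements events "$cutoff" > /tmp/drop_old.hql
# hive -f /tmp/drop_old.hql
```

For a one-month retention on another table, only the table name and the cutoff change. A crontab entry such as "0 3 * * * /path/to/drop_old_partitions.sh" (path is a placeholder) would run it nightly.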
On Sun, Feb 5, 2012 at 5:10 PM, Evan Pollan <[EMAIL PROTECTED]> wrote:
> I have an environment where I'm partitioning data in some hive tables by
> day. I'd like to be able to delete data that's older than 1 week in some
> tables and 1 month in others. It appears that ALTER TABLE <t> DROP
> PARTITION only supports a partition spec that equates a partition with a
> literal value.
> I must be missing something obvious — how would I delete partitions that
> meet some criteria, rather than partitions that I'm able to reference via
> literal value? The only way I can think of doing this is by first running
> a hive query to get all partitions that should be deleted, then
> constructing a series of ALTER TABLE DROP PARTITION statements, each with a
> literal corresponding to one of the PARTITIONS, and executing those in a
> second phase.
> I'd like to be able to do this all within one HQL script…
Evan Pollan 2012-02-05, 22:53