java8964 java8964 2013-10-11, 17:05
one easiest way to do this is
create a table where each date maps to week of month, week of year, day of
week, day of month and then do the join on just date and put the conditions
on where clause.
Its easy to manipulate the date column for my understanding and you can
join just based on date and get results based on where conditions.
PS: this is what we currently do where we have to do continuous rollup
analytics for yeat to date or parameter to date calculations.
Wait for others to give you better solutions,
On Fri, Oct 11, 2013 at 10:35 PM, java8964 java8964 <[EMAIL PROTECTED]>wrote:
> I have the requirement trying to support in hive, not sure if it is doable.
> I have the hadoop 1.1.1 with Hive 0.9.0 (Using deby as the meta store)
> If I partition my data by a dt column, so if my table 'foo' have some
> partitions like 'dt=2013-07-01' to 'dt=2013-07-30'.
> Now the user want to query all the data of Saturday only.
> To make it flexiable, instead of asking end user to find out what date in
> that month are Saturday, I add a lookup table (just called it 'bar') in the
> HIVE with following columns:
> year, month, day, dt_format, week_of_day
> So I want to see if I can join with foo and bar to still get the partition
> select *
> from foo
> on (bar.year=2013 and bar.month=7 and bar.day_of_week=6 and bar.dt_foramt
> = foo.dt)
> I tried several ways, like switch the table order, join with subquery etc,
> none of them will make partition pruning works in this case on table foo.
> Can this really archivable in hive?