-Re: Help with last 30 day unique user query
Vijay 2010-10-15, 18:19
Thanks Alex! That is exactly what I thought was the limitation but wanted to
make sure I'm not missing anything.
On Fri, Oct 15, 2010 at 10:51 AM, Alex Boisvert <[EMAIL PROTECTED]> wrote:
> As far as I know, Hive has no built-in support for sliding-window
> analytics. There is an enhancement request here:
> <https://issues.apache.org/jira/browse/HIVE-896>Without such support, the
> brute force way of doing things is,
> SELECT COUNT(DISTINCT user_id) FROM events WHERE event_date > start_date
> and event_date <= end_date;
> (repeated N times to cover each day of your time window).
> On Thu, Oct 14, 2010 at 11:36 PM, Vijay <[EMAIL PROTECTED]> wrote:
>> Hi, I need help with this scenario. We have a table of events which has
>> columns date, event (not important for this discussion), and user_id. It is
>> obviously easy to find number of unique users for each day. I also need to
>> find number of unique users in the last 30 days for each day. This is also
>> quite simple to do for one day. However, I cannot figure out how to do this
>> for a range of days. Something like this is pretty straightforward in most
>> RDBMS but with HiveQL has I'm finding this hard. I might be missing
>> something simple though. Any help is appreciated. Ideally the query should
>> also be as optimized as possible as this table could be huge.