Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
Hive >> mail # user >> Help with last 30 day unique user query


Copy link to this message
-
Re: Help with last 30 day unique user query
Thanks Alex! That is exactly what I thought was the limitation but wanted to
make sure I'm not missing anything.

On Fri, Oct 15, 2010 at 10:51 AM, Alex Boisvert <[EMAIL PROTECTED]> wrote:

> As far as I know, Hive has no built-in support for sliding-window
> analytics.  There is an enhancement request here:
> https://issues.apache.org/jira/browse/HIVE-896
>
> <https://issues.apache.org/jira/browse/HIVE-896>Without such support, the
> brute force way of doing things is,
>
> SELECT COUNT(DISTINCT user_id) FROM events WHERE event_date > start_date
> and event_date <= end_date;
>
> (repeated N times to cover each day of your time window).
>
> alex
>
> On Thu, Oct 14, 2010 at 11:36 PM, Vijay <[EMAIL PROTECTED]> wrote:
>
>> Hi, I need help with this scenario. We have a table of events which has
>> columns date, event (not important for this discussion), and user_id. It is
>> obviously easy to find number of unique users for each day. I also need to
>> find number of unique users in the last 30 days for each day. This is also
>> quite simple to do for one day. However, I cannot figure out how to do this
>> for a range of days. Something like this is pretty straightforward in most
>> RDBMS but with HiveQL has I'm finding this hard. I might be missing
>> something simple though. Any help is appreciated. Ideally the query should
>> also be as optimized as possible as this table could be huge.
>>
>> Thanks,
>> Vijay
>>
>>
>
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB