Hive >> mail # user >> Rolling MAU computation


Re: Rolling MAU computation
The problem is that "day" is the loop variable in the for loop.

I've tried joining against a table that contains the set of days, but the
problem is that you can't join on a range: Hive only supports equality
conditions in a join. For example:

INSERT OVERWRITE TABLE mausummary
    SELECT day, COUNT(DISTINCT(userid))
    FROM days
    JOIN logins ON date_add(logins.t, 30) >= days.day AND logins.t < days.day
    GROUP BY day;

This fails because of the range condition in the join.
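
One possible way around this, sketched under assumptions rather than taken from this thread, is to drop the range from the ON clause, let the two tables form a Cartesian product, and apply the same range as a WHERE filter. The sketch below runs that query shape against SQLite from Python so it can be tried locally. The sample rows are hypothetical, SQLite's date(t, '+30 days') stands in for Hive's date_add(t, 30), and whether your Hive version accepts a join with no ON condition (or CROSS JOIN) is something to verify. The product of days and logins can also be very large on real data.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the query above.
LOGINS = [
    ("2012-09-01", "alice"),
    ("2012-09-20", "bob"),
    ("2012-10-05", "alice"),
    ("2012-10-09", "carol"),
]
DAYS = ["2012-10-01", "2012-10-10", "2012-11-15"]


def rolling_mau(logins, days):
    """Return (day, distinct users seen in the 30 days before day) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE logins (t TEXT, userid TEXT)")
    conn.execute("CREATE TABLE days (day TEXT)")
    conn.executemany("INSERT INTO logins VALUES (?, ?)", logins)
    conn.executemany("INSERT INTO days VALUES (?)", [(d,) for d in days])
    rows = conn.execute(
        """
        SELECT days.day, COUNT(DISTINCT logins.userid)
        FROM days
        CROSS JOIN logins                 -- no range condition in the join
        WHERE date(logins.t, '+30 days') >= days.day
          AND logins.t < days.day         -- range moved into the filter
        GROUP BY days.day
        ORDER BY days.day
        """
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for day, mau in rolling_mau(LOGINS, DAYS):
        print(day, mau)
```

Note that a day with no logins in its window (2012-11-15 in the sample data) produces no output row at all, because the filter removes every pairing for that day before the GROUP BY.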

Tom
On Wed, Oct 10, 2012 at 8:50 PM, MiaoMiao <[EMAIL PROTECTED]> wrote:

> How about
> SELECT day, COUNT(DISTINCT(userid)) FROM logins WHERE day - logins.day
> < 30 GROUP BY day;
>
> On Thu, Oct 11, 2012 at 6:05 AM, Tom Hubina <[EMAIL PROTECTED]> wrote:
> > I'm trying to compute the number of active users in the previous 30 days
> > for each day over a date range. I can't think of any way to do it directly
> > within Hive so I'm wondering if you guys have any ideas.
> >
> > Basically the algorithm is something like:
> >
> > For each day in date range:
> >    SELECT day, COUNT(DISTINCT(userid)) FROM logins
> >    WHERE day - logins.day < 30;
> >
> > Thanks for your help!
> >
> > Tom
> >
>