Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Plain View
Hive >> mail # user >> how to write hive query to solve this problem?


+
qiaoresearcher 2013-08-30, 23:34
Copy link to this message
-
Re: how to write hive query to solve this problem?
so this is not particular to Hive is it? You could post this on a DB2,
Oracle, or even Stackflow board i'd imagine.
On Fri, Aug 30, 2013 at 4:34 PM, qiaoresearcher <[EMAIL PROTECTED]>wrote:

>
> I have three tables:
>
> Table 1: record when and who visited gas station or not, this contains all
> the users of interest, name all the users as a set A
> date                         |      user name    |   visited gas station?
> 2013-09-01                         tom                             yes
> 2013-09-02                         tom                             yes
> 2013-09-01                         hanks                          yes
> 2013-09-03                         tomy                            yes
> ....                                           ...
>        ...
>
> Table 2: record when and who visited Bestbuy,   the user in set A appear
> here, but not all users of A will appear in this table, also table 2 has
> users does not belong to set A
> date                         |      user name    |   visited Bestbuy?
> 2013-09-01                         tom                             yes
> 2013-09-02                         jacob                           yes
> 2013-09-01                         hanks                          yes
> 2013-09-03                         michael                       yes
> ....                                           ...
>        ...
>
> Table 3: record when and who arrives one of three destinations: CA, NY and
> DC, the users in table 3 has similar situation as users in table 2
> regarding set A.
> date                         |      user name    |   visited Bestbuy or
> not
> 2013-09-01                         tom                             CA
> 2013-09-02                         tom                             NY
> 2013-09-01                         hanks                          DC
> 2013-09-03                         tomy                            CA
> ....                                           ...
>        ...
>
> Now we want to know,  within a 90 days period, what are the following
> numbers:
> (1) for any given day, for the users in table 1, how many of them has a
> path like:   visited Gas station first, after that date, user went to
> Bestbuy, finally arrive CA
> (2) for any given day, for the users in table 1, how many of them has a
> path like:   visited Gas station first, then did not visit Bestbuy, finally
> arrive CA
>
> (3) for any given day, for the users in table 1, how many of them has a
> path like:   visited Gas station first, then went Bestbuy, finally arrive NY
> (4) for any given day, for the users in table 1, how many of them has a
> path like:   visited Gas station first, then did not visit Bestbuy,
> finally arrive NY
>
> (5) for any given day, for the users in table 1, how many of them has a
> path like:   visited Gas station first, then went Bestbuy, finally arrive DC
> (6) for any given day, for the users in table 1, how many of them has a
> path like:   visited Gas station first, then did not visit Bestbuy,
> finally arrive DC
>
> It is possible in a given day, a user may visit Bestbuy many times for
> table 2, this will be simply regarded as a status 'visited Bestbuy'. One
> user will be counted once in a given day.
>  From the day of user visiting gas station, within next 90 days, this user
> has to arrive one of three final destinations in table 3, and arrives only
> one station. no multiple arrived stations.
>
> For example,
> a user tom may visit gas station on 9/1, then went to Bestbuy on 9/5,
> finally went to CA on 9/30, then this user tom can be counted as 1 on 9/1
> for path (1).
> a user hanks may visit gas station on 9/1, then went to Bestbuy on 9/5 and
> on 9/8, finally went to CA on 9/30, then this user tom count as 1 on 9/1
> for path (1).
> a user ruby may visit gas station on 9/1, and tables 2 does not have
> records to show ruby visit Bestbuy till 11/30, and ruby arrives DC in table
> 3 before 11/30, then ruby contribute 1 for path (6) for the day 9/1.
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB