-
counting number of pageviews and unique pageviews
Cam Bazz 2012-06-11, 12:45
Hello,
I have finally written a program to upload my data to Amazon S3, start a cluster on Amazon EMR, and recover my partitions, and I can issue simple queries on Hive.
Now I would like to run:
select count(*),itemSid from items group by itemSid <- gives me how many times an item was viewed
and another query to extract unique views, which I don't know how to write yet.
How do I store the outputs of these queries,
such as:
itemSid, pageViews, uniquePageViews
1 10 8
Common sense tells me to store the results of query A, then query B, and then combine them in a table.
Is that correct, and if so, how can I accomplish this?
Best Regards, C.B.
-
Re: counting number of pageviews and unique pageviews
Nitin Pawar 2012-06-11, 12:49
If you want to write these results back to an RDBMS, you can use Sqoop.
If you want to save the results to a file, just redirect the output of the query to that file.
Can you tell me how you are executing the Hive query from your code? That will help in answering your question.
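For the file option, one possibility is to let Hive write the result set out itself instead of redirecting the CLI output. A minimal sketch, assuming the `items` table from the original question; the output path `/tmp/item_pageviews` is a hypothetical placeholder:

```sql
-- Write per-item page view counts to a directory on the local filesystem.
-- '/tmp/item_pageviews' is a placeholder path (an assumption, not from the thread).
-- Any existing contents of that directory are overwritten.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/item_pageviews'
SELECT itemSid, COUNT(*) AS pageViews
FROM items
GROUP BY itemSid;
```

The result lands as plain-text files in that directory, which a loader script could then pick up.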
-- Nitin Pawar
-
Re: counting number of pageviews and unique pageviews
Cam Bazz 2012-06-11, 12:53
Hello Nitin,
Yes, I want to write these results back to an RDBMS (Postgres). I had written a sort of merge program to load text data into Postgres, but I will look at Sqoop.
Currently there is no program, but I will write either a script or a Java program.
I will be making a number of queries about the same item, and then have to combine those results in a table like
itemSid, measurementA, measurementB ...
I am concerned about combining those results. Basically, from the Hive queries I get:
for query A: itemSid, measurementA
for query B: itemSid, measurementB
...
etc.
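One way to combine per-query results like these is a join on itemSid. A sketch, assuming the outputs of query A and query B have already been stored in two Hive tables; the table names `result_a` and `result_b` are hypothetical:

```sql
-- result_a(itemSid, measurementA) and result_b(itemSid, measurementB)
-- are assumed tables holding the stored output of each query.
SELECT COALESCE(a.itemSid, b.itemSid) AS itemSid,
       a.measurementA,
       b.measurementB
FROM result_a a
FULL OUTER JOIN result_b b
  ON a.itemSid = b.itemSid;
```

The full outer join keeps items that appear in only one of the two results; the missing measurement then shows up as NULL.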
Best. C.B.
-
Re: counting number of pageviews and unique pageviews
Nitin Pawar 2012-06-11, 14:47
Why not write a single query, with joins where needed, that produces the results you want?
Or you can write separate queries, insert the intermediate data into temp tables, and clean them up once your execution is over.
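Both suggestions can be sketched in HiveQL. This assumes the `items` table has a column identifying the viewer; the name `visitorId` is hypothetical, as are the table names `tmp_pageviews`, `tmp_unique`, and `item_stats`:

```sql
-- Option 1: a single query. Both measurements share the same grouping,
-- so no join is needed here. visitorId is an assumed column.
SELECT itemSid,
       COUNT(*)                  AS pageViews,
       COUNT(DISTINCT visitorId) AS uniquePageViews
FROM items
GROUP BY itemSid;

-- Option 2: intermediate tables, combined and then cleaned up.
CREATE TABLE tmp_pageviews AS
SELECT itemSid, COUNT(*) AS pageViews
FROM items
GROUP BY itemSid;

CREATE TABLE tmp_unique AS
SELECT itemSid, COUNT(DISTINCT visitorId) AS uniquePageViews
FROM items
GROUP BY itemSid;

-- Combine the two intermediate results into the final table.
CREATE TABLE item_stats AS
SELECT p.itemSid, p.pageViews, u.uniquePageViews
FROM tmp_pageviews p
JOIN tmp_unique u
  ON p.itemSid = u.itemSid;

-- Remove the intermediate tables once the final table exists.
DROP TABLE tmp_pageviews;
DROP TABLE tmp_unique;
```

Option 1 is simpler when every measurement can be computed over the same grouping. Option 2 may be easier to extend when each measurement comes from a different source table or filter.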
-- Nitin Pawar