Hive >> mail # user >> question about hive SQL


Re: question about hive SQL
Here is my stab at it. I have not tested it, but this should get you started.

The following points are important:

1. I added a WHERE clause in the subquery to limit the data set by any partition you may have.
2. You have to write a collect UDF to use it. See the class GenericUDAFCollect in Chapter 13, "Functions", of Wampler/Capriolo's book.

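SELECT
     pcw.page_url,
     pcw.token,
     -- concat_ws expects strings, so the weight is cast explicitly
     collect(concat_ws('|', pcw.original_category, CAST(pcw.weight AS STRING)))
FROM
     (SELECT
          page_url,
          token,
          original_category,
          weight
     FROM
          media_visit_info
     WHERE
          partition_column = 'partition_col_val'
     -- every selected column must appear in the GROUP BY
     GROUP BY
          page_url,
          token,
          original_category,
          weight
     ) pcw
-- one output row per page_url/token pair, with its categories collected
GROUP BY
     pcw.page_url,
     pcw.token
LIMIT 10
;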
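To try the shape of this query without a Hive cluster, here is a minimal Python sketch that runs an equivalent statement against an in-memory SQLite database. It rests on several assumptions: SQLite's built-in group_concat stands in for the custom collect UDAF, the `||` operator stands in for concat_ws, the partition filter is left out, and the sample rows and the function name collect_categories are made up for illustration. It is not Hive and only shows the grouping logic.

```python
import sqlite3

# Hypothetical sample rows; column names follow the thread, values are invented.
ROWS = [
    ("http://a.example", 20, "CN10-1-1", 0.5),
    ("http://a.example", 20, "CN11-2-2", 0.1),
    ("http://a.example", 20, "CN10-1-1", 0.5),  # duplicate, removed by the inner GROUP BY
    ("http://b.example", 7, "CN10-1-3", 0.02),
]

# Same structure as the Hive query: de-duplicate in the subquery,
# then aggregate one row per (page_url, token) in the outer query.
# group_concat is SQLite's stand-in for the collect UDAF (an assumption).
QUERY = """
SELECT pcw.page_url,
       pcw.token,
       group_concat(pcw.original_category || '|' || pcw.weight) AS categories
FROM (SELECT page_url, token, original_category, weight
      FROM media_visit_info
      GROUP BY page_url, token, original_category, weight) pcw
GROUP BY pcw.page_url, pcw.token
ORDER BY pcw.page_url
"""


def collect_categories(rows):
    """Load rows into an in-memory table and return (url, token, [category|weight, ...])."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE media_visit_info "
        "(page_url TEXT, token INTEGER, original_category TEXT, weight REAL)"
    )
    conn.executemany("INSERT INTO media_visit_info VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    # group_concat joins with ',' in no guaranteed order, so split and sort.
    return [(url, token, sorted(cats.split(","))) for url, token, cats in result]


if __name__ == "__main__":
    for row in collect_categories(ROWS):
        print(row)
```

Each page_url/token pair comes back once, with its distinct category and weight pairs gathered into a single list, which is what the collect call is meant to do in Hive.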

From: ch huang <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
Date: Monday, August 19, 2013 2:04 AM
To: [EMAIL PROTECTED]
Subject: question about hive SQL

Hi all,
       I am not very familiar with HQL. My problem is that I now have two queries:

Q1: select page_url, original_category,token from media_visit_info group by page_url, original_category,token limit 10
Q2:  select original_category as code , weight from media_visit_info where page_url='X' group by original_category,weight;

The page_url value from Q1 should be passed to the WHERE condition of Q2, and the two query results should be combined like this:

{
url:http\\:www.baidu.com,
category:|CN10,
token:20,
categorys:
[
{code:|CN10-1-1,weight:0.5},
{code:|CN11-2-2,weight:0.1},
{code:|CN10-1-3,weight:0.02}
]
}

I do not know whether this can be written as one query (JOIN + subquery?). Can anyone help?
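The nested record shown in the question can also be assembled on the client side once the rows are grouped. The sketch below is one possible approach, not part of the thread: it assumes each page's categories arrive as a list of 'code|weight' strings (the form the concat_ws('|', ...) call in the reply would produce), and the function name to_document and the sample URL are hypothetical. Because the codes in the question themselves begin with '|', the string is split on the last '|' only.

```python
import json


def to_document(page_url, category, token, collected):
    """Build one nested record from a list of 'code|weight' strings."""
    categorys = []  # key spelled as in the question's sample output
    for item in collected:
        # Split on the last '|' so codes such as '|CN10-1-1' stay intact.
        code, weight = item.rsplit("|", 1)
        categorys.append({"code": code, "weight": float(weight)})
    return {
        "url": page_url,
        "category": category,
        "token": token,
        "categorys": categorys,
    }


if __name__ == "__main__":
    doc = to_document(
        "http://example.com/page",  # hypothetical URL
        "|CN10",
        20,
        ["|CN10-1-1|0.5", "|CN11-2-2|0.1", "|CN10-1-3|0.02"],
    )
    print(json.dumps(doc, indent=2))
```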
