Hive >> mail # user >> querying objects and list fields


Lauren Blau 2013-01-25, 11:25
Re: querying objects and list fields
Use size() instead of count(). count() is an aggregate for counting rows,
while size() returns the number of elements in a collection.
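
A minimal sketch of that, assuming the MessageData table and the column
names from the quoted question (I haven't run it against your SerDe):

```sql
-- Table and column names are taken from the quoted question, not verified.
SELECT messageId,
       lastmodifiedDate,
       size(contexts) AS num_contexts  -- element count of the contexts list in this row
FROM MessageData
LIMIT 5;
```

The alias num_contexts is only illustrative. Unlike count(), size() is
evaluated per row, so no GROUP BY is needed.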

For the second question, I think you'll need to call explode() on the array,
turning its elements into rows first. Search for Hive's "lateral view" to see
the correct syntax for exploding while also projecting other fields.
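
Roughly, the lateral view form looks like the sketch below. It assumes the
same MessageData table, and that each element of contexts is exposed as a
struct with a conceptId field; the aliases ctxTable and ctx are made up for
illustration:

```sql
-- One output row per (message, context element) pair.
-- ctxTable and ctx are hypothetical aliases; conceptId comes from the quoted question.
SELECT messageId,
       lastmodifiedDate,
       ctx.conceptId
FROM MessageData
LATERAL VIEW explode(contexts) ctxTable AS ctx
LIMIT 5;
```

Note that LIMIT 5 here applies to the exploded rows, so it returns five
context elements rather than five messages.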

dean

On Fri, Jan 25, 2013 at 5:25 AM, Lauren Blau <[EMAIL PROTECTED]> wrote:

> I'm building up a set of classes (objectinspectors and serdes) to allow
> hive queries over some data files I have. While I'm making it work, I don't
> fully grok all the concepts involved.
> Right now I've got 2 questions.
>
> I'm able to make queries like this (this is the first syntax I tried to
> query into what I know are lists of objects, is it the best way?):
> select messageId,lastmodifiedDate,contexts[1].conceptId from MessageData
> LIMIT 5; and get the conceptId of the first context element in the first 5
> rows (my MessageData contexts field is a list of context objects).
> select messageId,lastmodifiedDate,contexts.conceptId from MessageData
> LIMIT 5; and get the conceptId of all the context elements in the first 5
> rows
>
> but I can't make a query like this
>
> select messageId,lastmodifiedDate,count(contexts) LIMIT 5;
>
> Is there a different syntax to query the length of that list of objects?
>
>
> Also, currently when you query
> select messageId, lastmodifiedDate,contexts LIMIT 1; you get a fully
> expanded representation of all of the contexts for 1 row back. What I'd
> really like is for that query to just return the list of contextIds (as if
> the query had been contexts.contextId), but then to be able to query down
> into the contexts like above. Is there some way my ObjectInspector could
> respond to
>
> select messageId, lastmodifiedDate, contexts;  as if it were select
> messageId, lastmodifiedDate, contexts.contextId
> but also still respond correctly to
> select messageId, lastmodifiedDate, contexts.conceptId
> ?
>
> Thanks for the help,
> Lauren
>
>
--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330