Hive >> mail # user >> Hive client / thrift service query submission auditing


Matt Goeke 2012-09-12, 18:10
Re: Hive client / thrift service query submission auditing
Hey Matt,

We did something similar at Facebook to capture the information on who ran
what on the clusters and dumped that out to an audit db. Specifically we
were using Hive post-execution hooks to achieve that:

http://hive.apache.org/docs/r0.7.0/api/org/apache/hadoop/hive/ql/hooks/PostExecute.html

This is mostly called from the Hive CLI.

I am not sure if the particular hook that we had implemented was
contributed back, but this could potentially be a cool contribution :)

Ashish

On Wed, Sep 12, 2012 at 11:10 AM, Matt Goeke <[EMAIL PROTECTED]> wrote:

> All,
>
> I looked in the Hive JIRA and saw nothing like what we are looking to
> implement so I am interested in getting feedback as to whether there is
> any overlap in this and any other current efforts:
>
> Currently our Hive warehouse is open to querying from any of our business
> analysts and we pool them by user in the fair scheduler to prevent someone
> from hogging cluster resources.  We are looking to start summarizing
> details of their queries so that we can view common questions they ask in
> order to find ways to optimize our tables / submission process. One thought
> was to patch the Hive client / thrift server to write out the submitted
> queries to the DB that our metastore is on and from there we can perform
> some simple analytics to roll up a view of how they use the warehouse over
> time. This doesn't seem like it would be too difficult of an effort as the
> needed infrastructure is already in place but any suggestions or comments
> on this would be greatly appreciated. Also if this is interesting to anyone
> else we are happy to keep you in the loop as to any patches we create.
>
> --
> Matt Goeke
>
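
The audit-and-roll-up idea discussed in this thread could look roughly like the sketch below. It is only an illustration under stated assumptions: the `query_audit` table, its columns, and the function names are all hypothetical, and an in-memory SQLite database stands in for whatever database hosts the metastore. It does not show how Hive itself would invoke the recording step (for example from a post-execution hook); it only shows storing one row per submitted query and summarizing the rows afterwards.

```python
import sqlite3
from collections import Counter
from datetime import datetime, timezone


def open_audit_db(path=":memory:"):
    """Open (or create) the hypothetical audit database."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS query_audit (
               id INTEGER PRIMARY KEY,
               username TEXT NOT NULL,
               query_text TEXT NOT NULL,
               submitted_at TEXT NOT NULL
           )"""
    )
    return conn


def record_query(conn, username, query_text, submitted_at=None):
    """Store one submitted query; this is the part a hook or patch would call."""
    ts = submitted_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO query_audit (username, query_text, submitted_at) "
        "VALUES (?, ?, ?)",
        (username, query_text, ts),
    )
    conn.commit()


def queries_per_user(conn):
    """Roll up the number of submitted queries per user, busiest first."""
    return conn.execute(
        "SELECT username, COUNT(*) FROM query_audit "
        "GROUP BY username ORDER BY COUNT(*) DESC, username"
    ).fetchall()


def most_common_queries(conn, limit=10):
    """Count queries after a crude normalization (lowercase, collapsed spaces).

    Lowercasing also changes string literals inside the query, so this is
    only a rough grouping, not real query fingerprinting.
    """
    rows = conn.execute("SELECT query_text FROM query_audit").fetchall()
    counts = Counter(" ".join(text.lower().split()) for (text,) in rows)
    return counts.most_common(limit)
```

A short usage example: after `conn = open_audit_db()`, call `record_query(conn, "alice", "SELECT * FROM sales")` for each submission, then inspect `queries_per_user(conn)` and `most_common_queries(conn)` to see who queries the warehouse most and which questions recur.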
Matt Goeke 2012-09-12, 22:52