Hive >> mail # user >> Run hive queries, and collect job information


Run hive queries, and collect job information
Hi folks,

I would like to run a list of generated Hive queries. For each one, I would
like to retrieve the MR job_id (or ids, when there are multiple stages), and
then use that job_id to collect statistics from the JobTracker (cumulative
CPU, bytes read...).

How can I send Hive queries from a bash or Python script and retrieve the
job_id(s)?

For the second part (collecting stats for the job), we're using an MRv1 Hadoop
cluster, so I don't have the AppMaster REST API
<http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html>.
I'm about to collect the data from the JobTracker web UI. Any better ideas?

Mathieu
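
One possible approach to both parts, as a minimal Python sketch rather than a tested solution. It assumes the `hive` and `hadoop` commands are on the PATH, that the Hive CLI logs MapReduce job IDs of the form job_<timestamp>_<sequence> to stderr while a query runs, and that `hadoop job -status <job_id>` prints counters as `name=value` lines. The exact log and counter formats vary between versions, so check them on your cluster. All function names below are hypothetical.

```python
import re
import subprocess

# MRv1 job IDs look like job_201301011200_0001 (assumed log format).
JOB_ID_RE = re.compile(r"job_\d+_\d+")


def extract_job_ids(log_text):
    """Return the distinct job IDs found in log_text, in order of appearance."""
    seen = []
    for job_id in JOB_ID_RE.findall(log_text):
        if job_id not in seen:
            seen.append(job_id)
    return seen


def run_command(args):
    """Run a command and return (stdout, stderr); raise if it fails."""
    proc = subprocess.Popen(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    out, err = proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError("%s failed:\n%s" % (" ".join(args), err))
    return out, err


def run_hive_query(query, hive_cmd="hive"):
    """Run one query with `hive -e`; return (query output, job IDs)."""
    out, err = run_command([hive_cmd, "-e", query])
    # Assumption: progress lines that mention the job IDs go to stderr.
    return out, extract_job_ids(err)


def parse_counters(status_text):
    """Collect `name=value` lines whose value is an integer."""
    counters = {}
    for line in status_text.splitlines():
        name, sep, value = line.strip().rpartition("=")
        if sep and value.strip().isdigit():
            counters[name.strip()] = int(value)
    return counters


def job_counters(job_id, hadoop_cmd="hadoop"):
    """Fetch counters for one job through `hadoop job -status`."""
    out, _ = run_command([hadoop_cmd, "job", "-status", job_id])
    return parse_counters(out)


# Usage on a cluster (not executed here):
#   _, job_ids = run_hive_query("SELECT COUNT(*) FROM some_table")
#   stats = dict((j, job_counters(j)) for j in job_ids)
```

Going through the command-line client avoids scraping the JobTracker web UI, at the cost of depending on text output that is not a stable interface. Counter names with the same label in different groups would collide in this flat dictionary, so a real script may want to track the group as well.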