Run hive queries, and collect job information
Mathieu Despriee 2013-01-30, 10:03
Hi folks,

I would like to run a list of generated Hive queries. For each one, I would like to retrieve the MR job_id (or ids, if the query runs in multiple stages), and then use that job_id to collect statistics from the JobTracker (cumulative CPU, bytes read, ...).

How can I send Hive queries from a bash or Python script and retrieve the job_id(s)?

For the second part (collecting stats for the job), we're using an MRv1 Hadoop cluster, so I don't have the AppMaster REST API <http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html>. I'm about to collect the data from the JobTracker web UI. Any better ideas?

Mathieu
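For illustration, here is a minimal sketch of the first part, under a few assumptions: the `hive` CLI is on the PATH, queries are passed with `hive -e`, and the CLI logs a line of the form `Starting Job = job_..., Tracking URL = ...` to stderr for each MapReduce stage. The exact log wording may differ between Hive versions, so the regular expression should be checked against real output. The helper names `extract_job_ids` and `run_hive_query` are hypothetical, not part of any Hive API.

```python
import re
import subprocess

# Assumption: the Hive CLI logs "Starting Job = job_<timestamp>_<seq>, ..."
# once per launched MapReduce stage. Verify against your Hive version.
JOB_ID_RE = re.compile(r"Starting Job = (job_\d+_\d+)")


def extract_job_ids(hive_log):
    """Return the MR job ids found in Hive CLI log text, in order, deduplicated."""
    job_ids = []
    for job_id in JOB_ID_RE.findall(hive_log):
        if job_id not in job_ids:
            job_ids.append(job_id)
    return job_ids


def run_hive_query(query, hive_cmd="hive"):
    """Run one query with `hive -e` and return (exit code, stdout, job ids).

    Query results arrive on stdout; the job ids are parsed from the
    progress log that the CLI writes to stderr.
    """
    proc = subprocess.run(
        [hive_cmd, "-e", query],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout, extract_job_ids(proc.stderr)
```

A caller could loop over the generated queries, call `run_hive_query(q)` for each, and store the returned job ids next to the query text. For the second part, one possible alternative to scraping the web UI on MRv1 is to feed each id to `hadoop job -status <job_id>`, which prints the job counters; its output would still need parsing.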
Qiang Wang 2013-01-30, 10:25
Nitin Pawar 2013-01-30, 11:30
Mathieu Despriee 2013-01-30, 13:52