MapReduce >> mail # user >> i am about to scrape a page


RE: i am about to scrape a page
A little better than plain scraping: use lynx.
At least you don't have to parse HTML.
Thanks,
Abhishek
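
For what it's worth, the lynx suggestion above could be wrapped in a small script. This is only a rough sketch: it assumes lynx is installed and on the PATH, and it reuses the jobtracker_host:50030/scheduler address from the original message. The function names are made up for illustration, and the layout of the dumped text depends on the page, so no column parsing is attempted here.

```python
import subprocess


def lynx_dump_command(jobtracker_host, port=50030):
    """Build the lynx command line that renders the scheduler page as text.

    -dump prints the rendered page to stdout; -nolist omits the trailing
    list of link references. The host and port are placeholders.
    """
    url = "http://{}:{}/scheduler".format(jobtracker_host, port)
    return ["lynx", "-dump", "-nolist", url]


def dump_scheduler_page(jobtracker_host):
    """Run lynx and return the non-blank lines of the rendered page."""
    result = subprocess.run(
        lynx_dump_command(jobtracker_host),
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,  # raise CalledProcessError if lynx exits with an error
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

Calling dump_scheduler_page("jobtracker_host") would then give plain text lines to grep or split, instead of HTML.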
-----Original Message-----
From: Patai Sangbutsarakum [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 18, 2012 2:47 PM
To: [EMAIL PROTECTED]
Subject: i am about to scrape a page

I am looking for a way to retrieve info about which jobs are running, by which user, and in which pool(s); I am on cdh3u4 with the fair scheduler.
I do know that jobtracker_host:50030/scheduler shows that, so
scraping the page and handling the HTML table would be one way.

Is there any other, more civilized way: JSON format, command line?
hadoop job -list doesn't show the pool, which is pretty sad.

Input is really appreciated :-)

Thanks
Patai
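
If scraping the HTML table does turn out to be the way to go, a minimal sketch using only the Python standard library might look like the following. The column names and values in SAMPLE are invented for illustration and are not the real layout of the scheduler page, and the class and function names are hypothetical. It also assumes the page closes its td and tr tags.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def parse_table_rows(html):
    """Return all table rows found in an HTML string."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_scheduler_rows(jobtracker_host):
    """Fetch the scheduler page and parse its tables (host is a placeholder)."""
    url = "http://{}:50030/scheduler".format(jobtracker_host)
    with urlopen(url) as response:
        return parse_table_rows(response.read().decode("utf-8", "replace"))


# Hypothetical sample; the real page's columns may differ.
SAMPLE = """
<table>
  <tr><th>Submitted</th><th>JobID</th><th>User</th><th>Pool</th></tr>
  <tr><td>Oct 18, 14:40</td><td><a href="#">job_0001</a></td>
      <td>alice</td><td>etl</td></tr>
</table>
"""
```

With rows in hand, the first row can serve as the header, so each later row can be turned into a dict with dict(zip(header, row)) and filtered by user or pool.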