MapReduce >> mail # user >> i am about to scrape a page

Patai Sangbutsarakum 2012-10-18, 21:47
RE: i am about to scrape a page
A little bit better than plain scraping: use lynx to dump the page as text.
At least you don't have to parse HTML.
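
A rough sketch of that idea in Python, in case it helps. It assumes lynx is installed on the machine running the script; `-dump` and `-nolist` are standard lynx options that print the rendered page as plain text without the trailing link list. The helper names `dump_page_as_text` and `split_columns` are made up for illustration, and I have not checked how the scheduler page actually renders, so the whitespace splitting is only a starting point.

```python
import subprocess


def dump_page_as_text(url):
    """Render a page to plain text with `lynx -dump` (requires lynx on PATH)."""
    result = subprocess.run(
        ["lynx", "-dump", "-nolist", url],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if lynx exits non-zero
    )
    return result.stdout


def split_columns(text):
    """Split every non-empty line into whitespace-separated fields.

    Assumption: no field contains spaces. Adjust if the real page has
    multi-word cells such as job names or timestamps.
    """
    return [line.split() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    # jobtracker_host is the placeholder from the original question.
    page = dump_page_as_text("http://jobtracker_host:50030/scheduler")
    for fields in split_columns(page):
        print(fields)
```

You would still have to pick out the lines that belong to the job table, but that is plain string handling rather than HTML parsing.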
-----Original Message-----
From: Patai Sangbutsarakum [mailto:[EMAIL PROTECTED]]
Sent: Thursday, October 18, 2012 2:47 PM
Subject: i am about to scrape a page

I am looking for a way to retrieve info about which jobs are running, by which user, and in which pool(s); I am on cdh3u4 with the fair scheduler.
I do know that jobtracker_host:50030/scheduler shows that, so
scraping the page and handling the HTML table would be one way.

Is there any other, more civilized way: JSON format, command line?
hadoop job -list doesn't show the pool, which is pretty sad.

Input is really appreciated :-)
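
For reference, the scrape-the-table route mentioned in the question could look roughly like the sketch below, using only the Python standard library. The class and function names (`TableRows`, `table_rows`, `fetch_scheduler_rows`) are invented for this example, and the markup of the scheduler page is an assumption: the code simply collects the text of every `<td>`/`<th>` inside every `<tr>`, so which columns hold the user and the pool has to be checked against the real page.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TableRows(HTMLParser):
    """Collect the text of every table cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        # Finish the current cell, if any, and attach it to the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate cells without a closing tag
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag == "tr" and self._row is not None:
            self._close_cell()
            self.rows.append(self._row)
            self._row = None


def table_rows(html):
    """Return all table rows found in an HTML string as lists of cell text."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_scheduler_rows(url):
    """Download a page and return its table rows (no authentication assumed)."""
    with urlopen(url) as resp:
        return table_rows(resp.read().decode("utf-8", errors="replace"))
```

Calling `fetch_scheduler_rows("http://jobtracker_host:50030/scheduler")` would return every table row on the page, header rows included, so the caller still has to pick the right table and columns.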