Results 31 to 40 of 65
Re: Question from a Desperate Java Newbie - Hadoop - [mail # user]
...I totally obey the robots.txt since I am only fetching RSS feeds :-) I implemented my crawler with HttpClient and it is working fine. I often get messages about "Cookie rejected", but am abl...
   Author: edward choi, 2010-12-16, 06:14
Re: how to run jobs every 30 minutes? - Hadoop - [mail # user]
...Thanks for the tip. I took a look at it. Looks similar to Cascading I guess...? Anyway thanks for the info!!  Ed  2010/12/8 Alejandro Abdelnur  ...
   Author: edward choi, 2010-12-14, 05:26
Re: Is it possible to write file output in Map phase once and write another file output in Reduce phase? - Hadoop - [mail # user]
...Excuse me but could I ask one more question? Can I operate Bixo on a cluster other than Amazon EC2? I already am running a Hadoop cluster of my own, so I'd like to run Bixo on top of my cluste...
   Author: edward choi, 2010-12-11, 09:32
Re: Is it possible to write file output in Map phase once and write another file output in Reduce phase? - Hadoop - [mail # user]
...I'd start with only a few RSS feeds at first, but I plan to expand it to the scale of thousands of RSS feeds every 30 minutes eventually. That's why I am so eager to implement my system i...
   Author: edward choi, 2010-12-11, 08:30
Re: Question from a Desperate Java Newbie - Hadoop - [mail # user]
...Are you talking about java.net.HttpURLConnection? If so, I've already tried using that with getInputStream() function. But still no luck.  I actually got an interesting answer from Aard...
   Author: edward choi, 2010-12-10, 07:33
Re: Question from a Desperate Java Newbie - Hadoop - [mail # user]
...I would, but I am trying to integrate the crawler with Hadoop, so I wanted to write in Java :-)  2010/12/10 Santosh Borse  ...
   Author: edward choi, 2010-12-10, 07:29
Is it possible to write file output in Map phase once and write another file output in Reduce phase? - Hadoop - [mail # user]
...Hi,  I'm trying to crawl numerous news sites. My plan is to make a file containing a list of all the news rss feed urls, and the path to save the crawled news article. So it would be li...
   Author: edward choi, 2010-12-10, 07:27
Question from a Desperate Java Newbie - Hadoop - [mail # user]
...Excuse me for asking a general Java question here. I tried to find a Java mailing list on Google but none of them were active.  There is a problem that's been driving me crazy for a whi...
   Author: edward choi, 2010-12-09, 11:05
Re: how to run jobs every 30 minutes? - Hadoop - [mail # user]
...My mistake. Come to think of it, you are right, I can just make an infinite loop inside the Hadoop application. Thanks for the reply.  2010/12/7 Harsh J  ...
   Author: edward choi, 2010-12-08, 05:53
Re: how to run jobs every 30 minutes? - Hadoop - [mail # user]
...Thanks for the reply, but I was hoping to keep the program loaded in memory all the time :-)  2010/12/7 li ping  ...
   Author: edward choi, 2010-12-08, 05:52