HDFS >> mail # user >> Hadoop doesn't find the input file


Re: Hadoop doesn't find the input file
Can you pastebin the stack trace involving the NPE?

Thanks

On Jan 4, 2014, at 9:25 AM, Manikandan Saravanan <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I’m trying to run Nutch 2.2.1 on a 2-node Hadoop cluster. My Hadoop cluster is running fine and I’ve successfully added the input and output directories to HDFS. But when I run
>
> $HADOOP_HOME/bin/hadoop jar /nutch/apache-nutch-2.2.1.job org.apache.nutch.crawl.Crawler urls -dir crawl -depth 3 -topN 5
>
> I’m getting something like:
>
> INFO input.FileInputFormat: Total input paths to process : 0
>
> As I understand it, this means that Hadoop cannot locate the input files. The job then fails, citing a null pointer exception. Can someone help me out?
>
> --
> Manikandan Saravanan
> Architect - Technology
> TheSocialPeople
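
While waiting for the stack trace, one way to narrow this down (a sketch, not a confirmed diagnosis): a relative path such as `urls` resolves under the HDFS home directory of the user running the job, normally /user/<username>, so it is worth checking that the seed files are really there. `check_input` and `HADOOP_CMD` below are hypothetical names introduced for illustration; only `hadoop fs -ls` is an actual Hadoop command.

```shell
#!/bin/sh
# Sketch only: check_input is a hypothetical helper, not part of Hadoop or Nutch.
# HADOOP_CMD defaults to the binary used in the original command.
HADOOP_CMD="${HADOOP_CMD:-${HADOOP_HOME:-}/bin/hadoop}"

check_input() {
    path="$1"
    # "hadoop fs -ls" exits non-zero when the path does not exist.
    if ! listing=$("$HADOOP_CMD" fs -ls "$path" 2>/dev/null); then
        echo "missing: $path"
        return 1
    fi
    # An existing but empty directory produces no listing output.
    if [ -z "$listing" ]; then
        echo "empty: $path"
        return 1
    fi
    # Otherwise show what the job should be able to see.
    echo "$listing"
}
```

Running `check_input urls` as the same user that submits the job should list the seed file(s). If it reports "missing" or "empty", the directory was probably created under a different user's home or under a different path than the one the job resolves.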