HDFS >> mail # user >> Error using hadoop in non-distributed mode


Re: Error using hadoop in non-distributed mode
Hi,

The path /tmp/hadoop-pat/mapred/local/archive/-4686065962599733460_1587570556_150738331/<snip>
is a location used by the TaskTracker process for the 'DistributedCache', a
mechanism that distributes files to all tasks running in a MapReduce job (see
http://hadoop.apache.org/common/docs/r1.0.3/mapred_tutorial.html#DistributedCache).
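To make the path in the error message concrete: the localized cache path appears to be assembled from a local work directory, a per-file unique directory, a "file" marker for local-filesystem sources, and the original absolute path appended verbatim. This is a minimal sketch inferred only from the paths quoted in this thread; the helper name is hypothetical, and the real logic lives inside Hadoop's cache-localization code, not here.

```java
// Illustration only (plain JDK, not Hadoop code): how the localized
// DistributedCache path in the error below seems to be put together.
// The layout is inferred from the paths in this thread.
public class CacheLayoutSketch {

    // Hypothetical helper mirroring the observed layout:
    //   <localDir>/archive/<uniqueDir>/file<absolute source path>
    // where "file" marks a local-filesystem source.
    static String localizedPath(String localDir, String uniqueDir, String sourcePath) {
        return localDir + "/archive/" + uniqueDir + "/file" + sourcePath;
    }

    public static void main(String[] args) {
        // Reconstructs the exact path from the FileNotFoundException below.
        String p = localizedPath(
                "/tmp/hadoop-pat/mapred/local",
                "-4686065962599733460_1587570556_150738331",
                "/Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000");
        System.out.println(p);
    }
}
```

This is why an input path like /Users/pat/... can show up again, nested under /tmp/hadoop-pat/mapred/local/archive/...: the cache copies the source file into its own directory tree and tasks are expected to read the copy, which never gets created if localization fails.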

You have mentioned Mahout, so I am assuming that the specific analysis job
you are running uses this feature to distribute the output file
/Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000 to the tasks of the job
that is failing.

Also, I found reports that the DistributedCache does not work in local
(non-HDFS) mode; see the second answer at
http://stackoverflow.com/questions/9148724/multiple-input-into-a-mapper-in-hadoop.
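One defensive option, assuming Hadoop 1.x as used in this thread, is to check for local mode before relying on the cache: local mode is indicated by the mapred.job.tracker property having its default value "local". The sketch below uses a plain Map in place of org.apache.hadoop.conf.Configuration purely to stay self-contained; the property name and default are from Hadoop 1.x.

```java
import java.util.Map;

// Sketch of a guard one could apply before depending on the
// DistributedCache. A Map stands in for Hadoop's Configuration
// so this example runs without the Hadoop jars.
public class LocalModeCheck {

    // In Hadoop 1.x, mapred.job.tracker defaults to "local";
    // any host:port value means a real (or pseudo-distributed) cluster.
    static boolean isLocalMode(Map<String, String> conf) {
        return "local".equals(conf.getOrDefault("mapred.job.tracker", "local"));
    }
}
```

In a real job the equivalent check would read conf.get("mapred.job.tracker", "local"), and in local mode the code could open the side file directly through the local filesystem instead of going through the cache.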

Thanks
hemanth
On Tue, Sep 4, 2012 at 10:33 PM, Pat Ferrel <[EMAIL PROTECTED]> wrote:

> The job creates several output and intermediate files, all under the
> location Users/pat/Projects/big-data/b/ssvd/. Several output directories
> and files are created correctly, and the
> file Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000 is created and
> exists at the time of the error. We seem to be passing
> in Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000 as the input file.
>
> Under what circumstances would an input path passed in as
> "Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000" be turned into
> "pat/mapred/local/archive/6590995089539988730_1587570556_37122331/file/Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000"?
>
>
> On Sep 4, 2012, at 1:14 AM, Narasingu Ramesh <[EMAIL PROTECTED]>
> wrote:
>
> Hi Pat,
> Please specify the correct input file location.
> Thanks & Regards,
> Ramesh.Narasingu
>
> On Mon, Sep 3, 2012 at 9:28 PM, Pat Ferrel <[EMAIL PROTECTED]> wrote:
>
>> Using Hadoop with Mahout in a local-filesystem (non-HDFS) config for
>> debugging purposes inside IntelliJ IDEA. When I run one particular part of
>> the analysis I get the error below. I didn't write the code, but we are
>> looking for a hint about what might cause it. This job completes without
>> error in a single-node pseudo-distributed config outside of IDEA.
>>
>> Several jobs in the pipeline complete without error, creating part files
>> just fine in the local file system.
>>
>> The file
>> /tmp/hadoop-pat/mapred/local/archive/6590995089539988730_1587570556_37122331/file/Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000
>>
>> which is the subject of the error, does not exist. But
>>
>> Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000
>>
>> does exist at the time of the error. So the code is looking for the data
>> in the wrong place?
>>
>> ….
>> 12/09/02 14:56:29 INFO compress.CodecPool: Got brand-new decompressor
>> 12/09/02 14:56:29 INFO compress.CodecPool: Got brand-new decompressor
>> 12/09/02 14:56:29 INFO compress.CodecPool: Got brand-new decompressor
>> 12/09/02 14:56:29 WARN mapred.LocalJobRunner: job_local_0002
>> java.io.FileNotFoundException: File
>> /tmp/hadoop-pat/mapred/local/archive/-4686065962599733460_1587570556_150738331/file/Users/pat/Projects/big-data/b/ssvd/Q-job/R-m-00000
>> does not exist.
>>         at
>> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:371)
>>         at
>> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
>>         at
>> org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator.<init>(SequenceFileDirValueIterator.java:92)
>>         at
>> org.apache.mahout.math.hadoop.stochasticsvd.BtJob$BtMapper.setup(BtJob.java:219)
>>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>         at
>> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>> Exception in thread "main" java.io.IOException: Bt job unsuccessful.
>>         at
>> org.apache.mahout.math.hadoop.stochasticsvd.BtJob.run(BtJob.java:609)