HDFS, mail # user - Program trying to read from local instead of hdfs


Re: Program trying to read from local instead of hdfs
Jay Vyas 2013-01-18, 00:16
One very ugly way to confirm that it's a problem with your config is to add
the config properties in code.
The problem with the Configuration object is that it doesn't tell you if the
path to the file is bad :( I really believe this should be changed, because
it is a major cause of frustration.

Worst case, you can just grep out all the configuration properties, do
conf.set("property","value") for each one, and then pass that Configuration
to your FileSystem.get(..) call, but that's not very portable.
Clearly, resource loading in the Configuration API is quite sensitive.
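
If it helps with the grep step, here is a rough sketch that uses only the
JDK (no Hadoop classes). It fails loudly when a site file is missing and
prints the name/value pairs you would otherwise copy by hand. The class and
method names (SiteXmlCheck, readProperties) are made up for illustration,
and it assumes the usual <configuration><property><name>/<value> layout of
core-site.xml and hdfs-site.xml.

```java
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Hadoop: checks that a *-site.xml file
// exists and lists the properties it defines.
final class SiteXmlCheck {

    static Map<String, String> readProperties(Path siteXml) throws Exception {
        // Fail loudly on a bad path instead of silently loading nothing.
        if (!Files.isReadable(siteXml)) {
            throw new NoSuchFileException(
                siteXml.toString(), null, "config file missing or unreadable");
        }
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(siteXml.toFile());

        // Collect every <property> that has both a <name> and a <value>.
        Map<String, String> props = new LinkedHashMap<>();
        NodeList nodes = doc.getElementsByTagName("property");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element property = (Element) nodes.item(i);
            String name = childText(property, "name");
            String value = childText(property, "value");
            if (name != null && value != null) {
                props.put(name, value);
            }
        }
        return props;
    }

    private static String childText(Element parent, String tag) {
        NodeList children = parent.getElementsByTagName(tag);
        return children.getLength() == 0
            ? null
            : children.item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // Pass the paths to core-site.xml and hdfs-site.xml as arguments.
        for (String arg : args) {
            readProperties(Paths.get(arg))
                .forEach((name, value) -> System.out.println(name + " = " + value));
        }
    }
}
```

Run it against the same paths you hand to the Configuration. If it throws,
the path is the problem; if it prints a default filesystem property (I am
assuming the key is fs.default.name on this Hadoop version), you at least
know what value the job should be seeing.
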
On Thu, Jan 17, 2013 at 7:06 PM, jamal sasha <[EMAIL PROTECTED]> wrote:

> No, it's not working. :( Same error.
>
>
>
> On Thu, Jan 17, 2013 at 4:00 PM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>
>> Hello Jamal,
>>
>>     Add the following 2 lines to your code and see if it works for you
>> (conf here is the Configuration object you pass to your job):
>>
>> conf.addResource(new Path("PATH_TO_YOUR_core-site.xml"));
>> conf.addResource(new Path("PATH_TO_YOUR_hdfs-site.xml"));
>>
>> Warm Regards,
>> Tariq
>> https://mtariq.jux.com/
>> cloudfront.blogspot.com
>>
>>
>> On Fri, Jan 18, 2013 at 5:26 AM, jamal sasha <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>   I am not sure what I am doing wrong.
>>> I copied my input files from local to HDFS at:
>>>
>>> /user/hduser/data/input1.txt
>>> /user/hduser/data/input2.txt
>>>
>>> In my driver code I have:
>>>
>>> MultipleInputs.addInputPath(conf, new Path(args[0]),
>>>     TextInputFormat.class, UserFileMapper.class);
>>> MultipleInputs.addInputPath(conf, new Path(args[1]),
>>>     TextInputFormat.class, DeliveryFileMapper.class);
>>>
>>> And then when I try to run the code with those paths, I get an error:
>>> Exception in thread "main" java.io.FileNotFoundException: File
>>> /user/hduser/data/input1.txt does not exist.
>>>
>>> And let's say on my local disk I have the files at:
>>> /Users/local/project/input1.txt
>>> /Users/local/project/input2.txt
>>> Then when I run the job with those paths, it throws an error:
>>> 13/01/17 15:26:06 INFO mapred.JobClient: Cleaning up the staging area
>>> hdfs://localhost:54310/app/hadoop/tmp/mapred/staging/hduser/.staging/job_201301021121_0151
>>> 13/01/17 15:26:06 ERROR security.UserGroupInformation:
>>> PriviledgedActionException as:mhduser
>>> cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not
>>> exist: hdfs://localhost:54310/Users/local/project/input1.txt
>>>
>>> Well, of course it's not in there; it's on my local disk.
>>> So when I give the HDFS location, it reads from local, and when I give
>>> my local location, it points to HDFS.
>>> What am I doing wrong?
>>> For reference, I am trying to run this code:
>>> http://kickstarthadoop.blogspot.com/2011/09/joins-with-plain-map-reduce.html
>>> Thanks
>>>
>>>
>>
>
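
On the symptom quoted above: a path with no scheme gets qualified against
whatever default filesystem the Configuration in use happens to carry, so
two Configuration objects with different defaults can send the same string
to two different places. The sketch below only imitates that with plain
java.net.URI; it is an analogy, not Hadoop's own Path handling, and the
class and method names (DefaultFsSketch, qualify) are made up.

```java
import java.net.URI;

// Illustration only: mimics how a scheme-less path is qualified against a
// default filesystem URI. This is java.net.URI, not Hadoop's Path class.
final class DefaultFsSketch {

    static URI qualify(String defaultFs, String path) {
        // A path without a scheme inherits scheme and authority from the
        // default; a fully qualified URI is returned unchanged.
        return URI.create(defaultFs).resolve(path);
    }

    public static void main(String[] args) {
        // Default filesystem is HDFS: the local path gets an hdfs:// prefix,
        // which matches the "Input path does not exist" message in the log.
        System.out.println(
            qualify("hdfs://localhost:54310/", "/Users/local/project/input1.txt"));

        // Default filesystem is local: the HDFS path is looked up on disk.
        System.out.println(
            qualify("file:///", "/user/hduser/data/input1.txt"));

        // A fully qualified input keeps its own scheme regardless of default.
        System.out.println(
            qualify("file:///", "hdfs://localhost:54310/user/hduser/data/input1.txt"));
    }
}
```

So one thing worth trying, while you sort out the config, is passing the
fully qualified hdfs://localhost:54310/user/hduser/data/... paths as
arguments and seeing whether the error changes.
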
--
Jay Vyas
http://jayunit100.blogspot.com