Pig, mail # user - ERROR 2118: Input path does not exist


Re: ERROR 2118: Input path does not exist
Thejas Nair 2011-09-21, 17:25
This is unlikely to be a configuration issue.
This query results in a map-only job, and the number of part files
depends on the number of map tasks spawned. In a typical configuration,
in Pig mapreduce mode, the number of map tasks is based on the HDFS
block size. A different number of map tasks or part files should not,
by itself, cause a difference in results.

You might want to check for any difference in delimiters used in the
query. Having a look at the actual lines that are different might help
you figure out what is wrong.
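A quick way to check whether the two runs produced genuinely different data, rather than just a different part-file split, is to sort the concatenated part files from each output and diff them. Below is a minimal sketch; the directory names (`out_cygwin/`, `out_vm/`) and the sample records are hypothetical stand-ins for local copies of the two `data/output2` directories (e.g. fetched with `hadoop fs -get`):

```shell
#!/bin/sh
# Sketch: compare two Pig output directories record-by-record,
# ignoring how the records happen to be split across part-m-* files.
# out_cygwin/ and out_vm/ are hypothetical local copies of the two
# 'data/output2' directories; the sample records are made up here
# so the sketch is self-contained.
mkdir -p out_cygwin out_vm
printf '9948111\n9948222\n' > out_cygwin/part-m-00001
printf '9948333\n'          > out_cygwin/part-m-00002
printf '9948333\n9948111\n9948222\n' > out_vm/part-m-00001

# Record counts first: a mismatch here means real data loss or gain.
wc -l out_cygwin/part-m-* out_vm/part-m-*

# Order-independent comparison of the actual records.
sort out_cygwin/part-m-* > a.sorted
sort out_vm/part-m-*     > b.sorted
diff a.sorted b.sorted && echo "outputs match"
```

If the sorted outputs differ, the first few diff lines usually show whether the discrepancy is a delimiter issue (whole lines split differently) or missing records.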

Thanks,
Thejas

On 9/21/11 4:50 AM, kiranprasad wrote:
> Hi
>
> On a Windows system using Cygwin, the output I got was 35 files
> (part-m-00001 - 00035) with the same log file xyz.txt (1 GB size) and
> the same filter
>
> using CYGWIN (Master)
> -----------
> grunt> A= LOAD 'data/xyz.txt' USING PigStorage();
> grunt> B= FILTER A BY ($0 matches '9948.*');
> grunt> STORE B INTO 'data/output2';
>
> using Linux VM (Master)
> ---------
> used the same script on this VM; in local mode and mapred mode only 5
> files (part-m-00001 - 00005) were generated as output, and the number of
> records also does not match.
>
> grunt> A= LOAD 'data/DNDDB.txt' USING PigStorage();
> grunt> B= FILTER A BY ($0 matches '9948.*');
> grunt> STORE B INTO 'data/output2';
>
> I think I missed some configurations !
>
> Regards
>
> Kiran.G
>
> -----Original Message----- From: kiranprasad
> Sent: Wednesday, September 21, 2011 4:58 PM
> To: Thejas Nair ; [EMAIL PROTECTED]
> Subject: Re: ERROR 2118: Input path does not exist
>
> Now I am able to connect to HDFS and execute the PIG Latin scripts in
> mapred
> mode,
> but when I compared the results with local mode and mapred mode they are
> different.
>
> Regards
> Kiran.G
>
> -----Original Message----- From: Thejas Nair
> Sent: Wednesday, September 21, 2011 2:23 AM
> To: [EMAIL PROTECTED]
> Cc: kiranprasad
> Subject: Re: ERROR 2118: Input path does not exist
>
> The put command that Marek described can do that.
> http://hadoop.apache.org/common/docs/r0.20.0/hdfs_shell.html#put
>
> You will need to have hadoop client on that machine or move data to a
> machine that has it. Copying 10GB of data over a LAN (?) should not take
> too long.
>
> -Thejas
>
>
> On 9/20/11 12:22 AM, kiranprasad wrote:
>> How can I LOAD a file of 10 GB size that is on another machine?
>>
>> -----Original Message----- From: Marek Miglinski
>> Sent: Tuesday, September 20, 2011 12:19 PM
>> To: [EMAIL PROTECTED]
>> Subject: RE: ERROR 2118: Input path does not exist
>>
>> Hey,
>>
>> '/data/test.txt' is supposed to be on HDFS (if you're not executing with
>> -x local); put it there from your local drive with the command:
>> hadoop fs -put
>>
>> for example, create the dir and then put:
>> hadoop fs -mkdir /data
>> hadoop fs -put /data/test.txt /data/
>>
>>
>> Sincerely,
>> Marek M.
>> ________________________________________
>> From: kiranprasad [[EMAIL PROTECTED]]
>> Sent: Tuesday, September 20, 2011 7:47 AM
>> To: [EMAIL PROTECTED]
>> Subject: Re: ERROR 2118: Input path does not exist
>>
>> Hi Marek
>>
>> I got the response as below
>>
>> [kiranprasad.g@pig4 bin]$ ./hadoop fs -ls /
>> Found 1 items
>> drwxr-xr-x - kiranprasad.g supergroup 0 2011-09-19 19:23 /tmp
>> but after loading (A= LOAD '/data/test.txt' USING PigStorage();),
>> I am getting the same exception.
>>
>> Message: org.apache.pig.backend.executionengine.ExecException:
>> ERROR 2118: Input path does not exist: hdfs://10.0.0.61/data/msisdns.txt
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:280)
>> at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
>> at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
>> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
>> at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:378)
>> at org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247)
>> at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:27