HDFS, mail # user - Re: MapReduce on Local files


Azuryy Yu 2013-04-03, 10:06
Re: MapReduce on Local files
Mohammad Tariq 2013-04-03, 10:16
Thank you, Azuryy. My question was about files ending with a tilde ("~").
These are backup files that the file manager hides from users, but my
job was able to see them. I am working on Ubuntu (GNOME DE).

Nothing serious, just out of curiosity :)

Warm Regards,
Tariq
https://mtariq.jux.com/
cloudfront.blogspot.com
On Wed, Apr 3, 2013 at 3:36 PM, Azuryy Yu <[EMAIL PROTECTED]> wrote:

> For FileInputFormat, files whose names start with "_" are treated as
> hidden by default. You can write a custom PathFilter and pass it to the
> InputFormat.
>
>
> On Wed, Apr 3, 2013 at 5:58 PM, Harsh J <[EMAIL PROTECTED]> wrote:
>
>> You've been misled by the GUI you use, I'm afraid. Many DEs (Desktop
>> Environments) treat ~-suffixed files as hidden, but that is not a
>> general convention (try ls, for example, or even shell expansion: both
>> ignore . prefixes, but not ~ suffixes) :)
>>
>> To answer specifically though, no, the base FileInputFormat does not
>> recognize ~ today, but if you want it to, you can pass a custom path
>> filter to your InputFormat's implementation for when it calls the
>> listStatus method.
>>
>> On Wed, Apr 3, 2013 at 3:16 PM, Mohammad Tariq <[EMAIL PROTECTED]>
>> wrote:
>> > Hello Harsh,
>> >
>> >         Thank you for the response. I am sorry for being unclear.
>> > Actually I was talking about the backup files which end with "~".
>> > I mean these files are not visible normally, but my job is able to
>> > see them. Does FileInputFormat behave in the same way for "~"
>> > as it does in the case of "." and "_"?
>> >
>> > Thanks.
>> >
>> > Warm Regards,
>> > Tariq
>> > https://mtariq.jux.com/
>> > cloudfront.blogspot.com
>> >
>> >
>> > On Wed, Apr 3, 2013 at 7:45 AM, Harsh J <[EMAIL PROTECTED]> wrote:
>> >>
>> >> Not quite sure if I got your question. These tidbits may help though,
>> >> from what I can understand:
>> >>
>> >> * LocalFileSystem's listing uses Java's APIs for file/dir listing, and
>> >> has no concept of what a hidden file is on its own. It retrieves the
>> >> whole list.
>> >> * MR's FileInputFormat (and its usual derivatives) filters out path
>> >> names starting with "." or "_" from the input paths added to the job.
>> >>
>> >> On Wed, Apr 3, 2013 at 3:09 AM, Mohammad Tariq <[EMAIL PROTECTED]> wrote:
>> >> >
>> >> > Warm Regards,
>> >> > Tariq
>> >> > https://mtariq.jux.com/
>> >> > cloudfront.blogspot.com
>> >> >
>> >> >
>> >> > ---------- Forwarded message ----------
>> >> > From: Mohammad Tariq <[EMAIL PROTECTED]>
>> >> > Date: Tue, Apr 2, 2013 at 5:16 PM
>> >> > Subject: MapReduce on Local files
>> >> > To: [EMAIL PROTECTED]
>> >> >
>> >> >
>> >> > Hello list,
>> >> >
>> >> > Is an MR job capable of reading even the hidden temp files present
>> >> > inside a directory located on my local FS? I noticed this today for
>> >> > the first time, because until now I had never tried running MR jobs
>> >> > on local files.
>> >> >
>> >> > Thank you so much for your time.
>> >> >
>> >> > Warm Regards,
>> >> > Tariq
>> >> > https://mtariq.jux.com/
>> >> > cloudfront.blogspot.com
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Harsh J
>> >
>> >
>>
>>
>>
>> --
>> Harsh J
>>
>
>
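
The filtering rule discussed in the thread can be sketched in plain Java. This is only an illustration, not Hadoop's own code: the class name TildeAwareFilter and its methods are hypothetical, and the "~" rule is the custom addition suggested above, not default behaviour. It lists a local directory with the standard Java APIs (which, as noted in the thread, return everything) and then skips names starting with "." or "_" as well as names ending with "~".

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Hadoop.
class TildeAwareFilter {

    // Rejects the prefixes the thread says FileInputFormat skips ("_" and "."),
    // plus the "~" suffix, which is an extra rule added for this sketch.
    static boolean accept(String name) {
        boolean hiddenByDefault = name.startsWith("_") || name.startsWith(".");
        boolean editorBackup = name.endsWith("~");
        return !hiddenByDefault && !editorBackup;
    }

    // Lists a local directory with plain Java APIs and keeps only accepted names.
    static List<String> acceptedNames(Path dir) throws IOException {
        List<String> kept = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                if (accept(name)) {
                    kept.add(name);
                }
            }
        }
        Collections.sort(kept);
        return kept;
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway directory containing one regular file and
        // three files that the filter should skip.
        Path dir = Files.createTempDirectory("mr-input");
        for (String name : new String[] {"data.txt", "data.txt~", ".swp", "_logs"}) {
            Files.createFile(dir.resolve(name));
        }
        System.out.println(acceptedNames(dir));
    }
}
```

In an actual job, the same check would presumably live in a class implementing org.apache.hadoop.fs.PathFilter, with accept(Path) testing path.getName(), and be registered through FileInputFormat.setInputPathFilter(job, YourFilter.class). It is worth checking both against the Hadoop version in use.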
Mohammad Tariq 2013-04-03, 10:03