Re: Hadoop Question
How about having the slave write to a temporary file first, and then move it to the location the master is monitoring once the file has been closed?

-Joey
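
A minimal sketch of the write-then-move idea above, under some assumptions. It uses java.nio.file on the local filesystem so that it runs without a cluster; on HDFS the corresponding steps would be FileSystem.create(...) to write the temporary file and FileSystem.rename(Path, Path) to move it. The class name AtomicPublish, the method publish, and the staging/watched directory layout are all hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class AtomicPublish {

    /**
     * Writes content to a file in stagingDir, closes it, and only then
     * moves it into watchedDir. A reader polling watchedDir therefore
     * never observes a half-written file.
     * Both directories are assumed to live on the same filesystem,
     * otherwise an atomic move is not possible.
     */
    static Path publish(Path stagingDir, Path watchedDir,
                        String fileName, String content) throws IOException {
        Path tmp = stagingDir.resolve(fileName);

        // Files.write opens the file, writes all bytes, and closes it.
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));

        // The move happens strictly after the writer has closed the file.
        Path target = watchedDir.resolve(fileName);
        return Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical layout: a staging folder next to the watched folder.
        Path base = Files.createTempDirectory("publish-demo");
        Path staging = Files.createDirectories(base.resolve("staging"));
        Path watched = Files.createDirectories(base.resolve("watched"));

        Path done = publish(staging, watched, "part-0001.txt", "some payload");
        System.out.println("published: " + done);
    }
}
```

Keeping the temporary files in a separate staging folder means the master does not need to filter anything out of the folder it watches. If the temporary file had to live in the watched folder itself, the master would instead need to skip files by a naming convention (for example a ".tmp" suffix).
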

On Jul 27, 2011, at 22:51, Nitin Khandelwal <[EMAIL PROTECTED]> wrote:

> Hi All,
>
> How can I determine whether a file in HDFS is currently being written to (by
> any thread)? I have a continuous process on the master node that watches a
> particular folder in HDFS for files to process. On the slave nodes, I am
> creating files in the same folder using the following code:
>
> At the slave node:
>
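> import org.apache.commons.io.IOUtils;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import java.io.OutputStream;
>
> // fileSystem is a FileSystem handle; path is the target Path in the watched folder
> OutputStream oStream = fileSystem.create(path);
> IOUtils.write(<Some String>, oStream);
> // closeQuietly swallows any IOException raised while closing the stream
> IOUtils.closeQuietly(oStream);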
>
>
> At the master node:
> I pick the earliest-modified file in the folder. At times, when I try to
> read the file, it is empty, most likely because the slave has not yet
> finished writing to it. Is there any way to tell the master that the slave
> is still writing to the file, so that it checks the file again later for
> the actual content?
>
> Thanks,
> --
>
>
> Nitin Khandelwal