Hadoop >> mail # user >> Hadoop Question


Re: Hadoop Question
How about having the slave write to a temp file first, then move it into the folder the master is monitoring once the file has been closed?
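
A rough sketch of that idea, in case it helps. To keep it runnable without a cluster, it uses the local filesystem through java.nio.file rather than HDFS; on HDFS the corresponding calls would be FileSystem.create(Path) for the write and FileSystem.rename(Path, Path) for the move. The class name, the method name, and the staging/watched directory layout are all made up for illustration, and it assumes both directories sit on the same filesystem.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper: write into a staging directory, then move the
// finished file into the directory the master is watching.
class TempThenRename {

    /**
     * Writes content to stagingDir/name, and only after the write has
     * completed and the file is closed moves it to watchedDir/name.
     * Returns the final path inside the watched directory.
     */
    static Path publish(Path stagingDir, Path watchedDir, String name, String content)
            throws IOException {
        Path tmp = stagingDir.resolve(name);
        Path done = watchedDir.resolve(name);

        // Files.write opens the file, writes all bytes, and closes it.
        Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));

        // The file appears in the watched directory only at this point,
        // already complete. ATOMIC_MOVE fails rather than falling back
        // to a copy if the filesystem cannot do an atomic move.
        return Files.move(tmp, done, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

With this layout the master never sees a half-written file in its folder, so it does not need to ask whether a writer is still active. If a separate staging folder is not an option, a temporary suffix that the master skips should work along the same lines.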

-Joey

On Jul 27, 2011, at 22:51, Nitin Khandelwal <[EMAIL PROTECTED]> wrote:

> Hi All,
>
> How can I determine whether a file is being written to (by any thread) in HDFS? I
> have a continuous process on the master node, which is tracking a particular
> folder in HDFS for files to process. On the slave nodes, I am creating files
> in the same folder using the following code :
>
> At the slave node:
>
> import org.apache.commons.io.IOUtils;
> import org.apache.hadoop.fs.FileSystem;
> import java.io.OutputStream;
>
> OutputStream oStream = fileSystem.create(path);
> IOUtils.write(<Some String>, oStream);
> IOUtils.closeQuietly(oStream);
>
>
> At the master node,
> I pick the earliest-modified file in the folder. At times, when I try
> reading the file, I find it empty, most likely because the slave is
> still writing to it. Is there any way to tell the master that the
> slave is still writing to the file, so that it checks the file again
> later for the actual content?
>
> Thanks,
> --
>
>
> Nitin Khandelwal