MapReduce >> mail # user >> Question about writing HDFS files


John Lilley 2013-05-16, 22:08
Harsh J 2013-05-16, 22:53
Re: Question about writing HDFS files
Hi Harsh,

I think that by "writing to local disk" John meant that the first replica
is written to the DataNode on the same node that initiated the write call.

John can further clarify.
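
A rough sketch of the placement behaviour described above. This is a
simplified model of the default HDFS replica placement as I understand it
(first replica on the writer's node if that node runs a DataNode, second on
a different rack, third on the same rack as the second), not the actual
NameNode code. The function name, the host names and the rack map are all
made up for illustration.

```python
import random


def choose_replica_nodes(writer, datanodes, replication=3, rng=None):
    """Pick target nodes for one block (simplified, hypothetical model).

    writer:    hostname of the client that initiated the write
    datanodes: dict mapping DataNode hostname -> rack name
    """
    rng = rng or random.Random()
    candidates = sorted(datanodes)
    chosen = []

    # First replica: the writer's own node if it runs a DataNode,
    # otherwise a random DataNode (e.g. a client outside the cluster).
    first = writer if writer in datanodes else rng.choice(candidates)
    chosen.append(first)

    while len(chosen) < min(replication, len(candidates)):
        remaining = [n for n in candidates if n not in chosen]
        if len(chosen) == 1:
            # Second replica: prefer a node on a different rack.
            preferred = [n for n in remaining
                         if datanodes[n] != datanodes[first]]
        elif len(chosen) == 2:
            # Third replica: prefer the same rack as the second replica.
            preferred = [n for n in remaining
                         if datanodes[n] == datanodes[chosen[1]]]
        else:
            preferred = []
        # Fall back to any remaining node if the preference cannot be met.
        chosen.append(rng.choice(preferred or remaining))
    return chosen
```

With a writer that is itself a DataNode, e.g.
choose_replica_nodes("dn1", {"dn1": "rackA", "dn2": "rackA", "dn3": "rackB",
"dn4": "rackB"}), the first entry is always "dn1". With a writer outside the
cluster the first entry is just some DataNode. In neither case is anything
staged on the client's local disk first.
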
On Fri, May 17, 2013 at 4:23 AM, Harsh J <[EMAIL PROTECTED]> wrote:

> That is not true. HDFS writes are not staged to a local disk first
> before being written onto the DataNodes. The old architecture docs
> seem to suggest that the writes get staged to a local disk but that's
> not true anymore, see https://issues.apache.org/jira/browse/HDFS-1454.
>
> Also worth noting that an HDFS client behaves the same way in almost
> all contexts, whether it's invoked from an MR framework or directly
> from the shell.
>
> On Fri, May 17, 2013 at 3:38 AM, John Lilley <[EMAIL PROTECTED]>
> wrote:
> > I seem to recall reading that when a MapReduce task writes a file, the
> > blocks of the file are always written to local disk, and replicated to
> > other nodes.  If this is true, is this also true for non-MR applications
> > writing to HDFS from Hadoop worker nodes?  What about clients outside of
> > the cluster doing a file load?
> >
> > Thanks
> >
> > John
> >
> >
>
>
>
> --
> Harsh J
>
John Lilley 2013-05-17, 13:38
J. Rottinghuis 2013-05-17, 15:24