Raj Vishwanathan 2013-04-09, 21:56
Re: When copying a file to HDFS, how to control what nodes that file will reside on?
Which Java file is responsible for replication? Which file chooses a random
datanode from the same rack, and which chooses a random rack?
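(For reference, in Hadoop 2.x the default placement logic lives in the
`BlockPlacementPolicyDefault` class, package
`org.apache.hadoop.hdfs.server.blockmanagement`; both the same-rack and
remote-rack choices are made there. A custom policy can be plugged in via
`hdfs-site.xml` — a minimal sketch, assuming a hypothetical class name
`com.example.MyPlacementPolicy` that extends `BlockPlacementPolicy`:)

```xml
<!-- hdfs-site.xml on the NameNode: swap in a custom block placement policy.
     The class name below is a placeholder for your own implementation. -->
<property>
  <name>dfs.block.replicator.classname</name>
  <value>com.example.MyPlacementPolicy</value>
</property>
```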
On Wed, Apr 10, 2013 at 3:26 AM, Raj Vishwanathan <[EMAIL PROTECTED]> wrote:

> You could use the following facts.
> 1. Files are stored in blocks. So make your blocksize bigger than the
> largest file.
> 2. The first replica of each block is written to the local node (when the
> client writing the file runs on a datanode).
>
> Raj
>
>   ------------------------------
> *From:* jeremy p <[EMAIL PROTECTED]>
> *To:* [EMAIL PROTECTED]
> *Sent:* Tuesday, April 9, 2013 1:49 PM
> *Subject:* When copying a file to HDFS, how to control what nodes that
> file will reside on?
>
> Hey all,
>
> I'm dealing with kind of a bizarre use case where I need to make sure that
> File A is local to Machine A, File B is local to Machine B, etc.  When
> copying a file to HDFS, is there a way to control which machines that file
> will reside on?  I know that any given file will be replicated across three
> machines, but I need to be able to say "File A will DEFINITELY exist on
> Machine A".  I don't really care about the other two machines -- they could
> be any machines on my cluster.
>
> Thank you.
>
>
>
--
*With regards ---*
*Mohammad Mustaqeem*,
M.Tech (CSE)
MNNIT Allahabad
9026604270
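(Raj's two facts above can be combined from the shell: run the copy on the
machine that should hold the file, with a block size larger than the file so
it occupies a single block. A sketch, assuming Hadoop 2.x property names and
a hypothetical target path `/data/fileA.dat`; the other two replicas land
wherever the default placement policy puts them:)

```shell
# Run this on Machine A, so the first replica of the (single) block
# is written to the local datanode. 536870912 = 512 MB; pick any value
# larger than the file being copied.
hdfs dfs -D dfs.blocksize=536870912 -put fileA.dat /data/fileA.dat

# Verify placement: fsck prints the datanodes holding each block.
hdfs fsck /data/fileA.dat -files -blocks -locations
```

Note this only pins the first replica; if Machine A later dies, HDFS may
re-replicate the block elsewhere.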
Patrick Angeles 2013-04-09, 20:57