Re: When copying a file to HDFS, how to control what nodes that file will reside on?
Which Java file is responsible for replication?
Which file chooses a random data node from the same rack, and which chooses random
On Wed, Apr 10, 2013 at 3:26 AM, Raj Vishwanathan <[EMAIL PROTECTED]> wrote:
> You could use the following facts.
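> 1. Files are stored in blocks, so make your block size larger than the
> largest file; each file then occupies a single block.
> 2. The first replica of each block is stored on the local node, i.e. the
> DataNode the client writes from.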
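A rough sketch of the first point, in case it helps: picking a block size that covers a whole file. It assumes the block size has to be a multiple of the checksum chunk size (512 bytes is assumed here as the usual io.bytes.per.checksum default; check your cluster's setting). The class and method names are made up for illustration and are not part of Hadoop.

```java
// Hypothetical helper, not a Hadoop class: computes a block size large
// enough to hold an entire file in one block.
public class BlockSizeSketch {

    /**
     * Returns the smallest multiple of bytesPerChecksum that is >= fileBytes.
     * An empty file still gets one chunk so the result is never zero.
     */
    static long blockSizeFor(long fileBytes, int bytesPerChecksum) {
        if (fileBytes < 0 || bytesPerChecksum <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative / positive");
        }
        // Ceiling division: number of checksum chunks needed to cover the file.
        long chunks = (fileBytes + bytesPerChecksum - 1) / bytesPerChecksum;
        return Math.max(1L, chunks) * bytesPerChecksum;
    }

    public static void main(String[] args) {
        long largestFile = 200L * 1024 * 1024 + 1; // example size, assumed
        int bytesPerChecksum = 512;                // assumed default
        System.out.println(blockSizeFor(largestFile, bytesPerChecksum));
    }
}
```

The resulting value would then be supplied as the block size when the file is written (for instance through the blockSize argument of FileSystem.create), with the copy run from the machine that should hold the first replica. Clusters may also enforce a minimum block size, so very small values might be rejected.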
> *From:* jeremy p <[EMAIL PROTECTED]>
> *To:* [EMAIL PROTECTED]
> *Sent:* Tuesday, April 9, 2013 1:49 PM
> *Subject:* When copying a file to HDFS, how to control what nodes that
> file will reside on?
> Hey all,
> I'm dealing with kind of a bizarre use case where I need to make sure that
> File A is local to Machine A, File B is local to Machine B, etc. When
> copying a file to HDFS, is there a way to control which machines that file
> will reside on? I know that any given file will be replicated across three
> machines, but I need to be able to say "File A will DEFINITELY exist on
> Machine A". I don't really care about the other two machines -- they could
> be any machines on my cluster.
> Thank you.
*With regards ---*