ALL HDFS Blocks on the Same Machine if Replication factor = 1


Razen Al Harbi 2013-06-10, 13:36
Re: ALL HDFS Blocks on the Same Machine if Replication factor = 1
It's normal. The default placement strategy stores the first replica on the node where the client is running (for write performance), then chooses a second node at random on another rack, then a third node on the same rack as the second. With a replication factor of 1, only that first, local replica is ever written, which is why every block ends up on the machine that invoked the put. Using a replication factor of 1 is not advised if you value your data. However, if you want a better distribution of blocks with 1 replica, consider uploading your files from a host that is not running a DataNode; the first replica is then placed on a randomly chosen node instead.

Daryn

On Jun 10, 2013, at 8:36 AM, Razen Al Harbi wrote:

> Hello,
>
> I have deployed Hadoop on a cluster of 20 machines. I set the replication factor to one. When I put a file (larger than HDFS block size) into HDFS, all the blocks are stored on the machine where the Hadoop put command is invoked.
>
> With a higher replication factor I see the same behavior, except that the additional replicas are stored randomly across the other machines.
>
> Is this normal behavior? If not, what could be the cause?
>
> Thanks,
>
> Razen
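
For anyone who wants to reproduce this, here is a minimal sketch from the shell. The file name and target path below are hypothetical, and the commands assume a Hadoop 2.x client; on older installs substitute "hadoop fs" and "hadoop fsck":

    # Upload a file larger than one HDFS block with a per-file replication factor of 1.
    # Run this on a DataNode to see every block land on the local machine,
    # or on a non-DataNode host to see the blocks spread across the cluster.
    hdfs dfs -D dfs.replication=1 -put bigfile.dat /tmp/bigfile.dat

    # List each block of the file and the DataNode(s) holding its replicas.
    hdfs fsck /tmp/bigfile.dat -files -blocks -locations

If the upload was done on a DataNode, the -locations output should show the same host for every block, matching the behavior described above.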
Kai Voigt 2013-06-10, 13:47
Shahab Yunus 2013-06-10, 13:57