Very large file copied to cluster, and the copy fails. All blocks bad
Saptarshi Guha 2009-02-13, 02:50
I have a 42 GB file on the local filesystem of one machine (call it A)
which I need to copy to HDFS (replication 1). According to the HDFS web
interface, the cluster has 208 GB across 7 machines.
Note that machine A has about 80 GB in total, so there is no room on it
to store additional copies of the file.
The command bin/hadoop dfs -put /local/x /remote/tmp/ fails, with all
blocks reported as bad. This is not surprising, since the file is
copied entirely to the part of HDFS that resides on A. Had the file
been spread across all machines, the copy would not have failed.
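As a rough check of the space argument above, here is a minimal sketch of the arithmetic. The file, disk and cluster sizes are the ones quoted in this post. The 64 MB block size is an assumption (the actual dfs.block.size on this cluster is not stated), and the 208 GB figure is treated as available capacity, which may be optimistic.

```python
# Back-of-the-envelope check: can machine A alone hold every block of the file?
# Sizes are taken from the post; the block size is an ASSUMPTION.

GB = 1024 ** 3
MB = 1024 ** 2

file_size = 42 * GB          # file on A's local filesystem
disk_total_a = 80 * GB       # total disk on machine A
cluster_capacity = 208 * GB  # figure reported for HDFS across the 7 machines
block_size = 64 * MB         # assumed dfs.block.size; check the cluster config

# Number of HDFS blocks the file is split into (ceiling division).
num_blocks = -(-file_size // block_size)

# The source file already occupies 42 GB of A's disk, so this is an upper
# bound on what A's DataNode could offer, ignoring everything else on disk.
max_free_on_a = disk_total_a - file_size

fits_on_a = file_size <= max_free_on_a
fits_on_cluster = file_size <= cluster_capacity

print(f"blocks: {num_blocks}")
print(f"upper bound on free space on A: {max_free_on_a // GB} GB")
print(f"fits on A alone: {fits_on_a}")
print(f"fits on the cluster as a whole: {fits_on_cluster}")
```

Under these assumptions, A can offer at most 38 GB, which is less than the 42 GB the file needs, while the cluster as a whole has enough room.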
I have more experience with MapReduce and not much with the HDFS side.
Is there a configuration option I'm missing that forces the file to be
split across the machines (while it is being copied)?
Saptarshi Guha - [EMAIL PROTECTED]