I am starting the upgrade from Hadoop 0.20 to a newer version that changes
the HDFS format (2.0). I have read a lot of tutorials, and they say that data loss is
possible (as expected). To avoid HDFS data loss, I will probably
back up the whole HDFS structure (7 TB per node). However, this is a huge amount
of data, and copying it will take a long time, during which my service would be down.
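For reference, the kind of full backup I had in mind is just a cluster-to-cluster copy with distcp, roughly like the sketch below (the hostnames, ports, and paths are placeholders for my setup, not real hosts):

    # Copy the production HDFS tree to a second cluster before the upgrade.
    # prod-nn / backup-nn and the paths are placeholders.
    hadoop distcp hdfs://prod-nn:8020/user hdfs://backup-nn:8020/backup-pre-2.0/user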
I was thinking about a simple approach: copying all files to a different location.
I tried to find a parallel file compressor to speed up the process, but could
not find one.
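For concreteness, what I am hoping for is something along these lines: pulling the data out of HDFS and piping a tar stream through a parallel gzip such as pigz (the paths and thread count below are placeholders):

    # Pull a directory out of HDFS, then compress it using multiple cores.
    # /user/data and /backup are placeholder paths; -p 8 runs 8 pigz threads.
    hadoop fs -copyToLocal /user/data /backup/data
    tar -cf - /backup/data | pigz -p 8 > /backup/data.tar.gz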
How did you guys do it?
Is there some trick?
Thank you in advance,