TT nodes distributed cache failure
Terry Healy 2013-01-25, 17:48
Running hadoop-0.20.2 on a 20 node cluster.
When running a Map/Reduce job that uses several .jars loaded into the distributed cache, several (~4) nodes have their map tasks fail with a ClassNotFoundException. All the other nodes proceed through the job normally and the job completes, but this is wasting 20-25% of my TT nodes. Can anyone explain why some nodes might fail to read all the .jars from the distributed cache? Thanks