I noticed that the shuffle phase reads data over HTTP even when the data
is available locally. I'm using Hadoop 1.0.3. Is there a reason it is
implemented this way? Would it be OK to make a change that detects when
the data is available locally and reads it from the local disk instead
of over HTTP?
I'm new to this developer list and to Apache developer lists in general, so
please feel free to let me know if there is a certain etiquette that I'm