Automatically Documenting Apache Hadoop Configuration
Praveen Sripati 2011-12-05, 16:44
Recently there was a query about making the Hadoop framework tolerate some
map/reduce task failures while still letting the job complete. The solution
was to set the 'mapreduce.map.failures.maxpercent' and
'mapreduce.reduce.failures.maxpercent' properties. Although this feature
was introduced a couple of years ago, it was not documented. I had a similar
experience with the 0.23 release.
It would be really good for Hadoop adoption to automatically extract and
document all the existing configurable properties in Hadoop, and also to
identify the properties newly added in a particular release, as part of the
build process. Documentation would also lead to fewer queries in the forums.
Cloudera has done something similar. Though it's not 100% accurate, it
would definitely help to some extent.
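To make the idea concrete, here is a minimal sketch of what such a build-time
scan could look like. It is only an illustration, not how Cloudera or Hadoop
does it: the class name ConfigKeyScanner, the method names, and the list of
key prefixes are all my own assumptions, and a regex over string literals
will miss keys that are built by concatenation and may pick up strings that
are not configuration keys at all.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the name and the prefix list below are assumptions.
class ConfigKeyScanner {
    // Heuristic: a double-quoted literal that starts with an assumed
    // Hadoop-style prefix and is followed by one or more dotted segments.
    private static final Pattern KEY_LITERAL = Pattern.compile(
        "\"((?:mapreduce|mapred|dfs|yarn|hadoop|fs|io|ipc)(?:\\.[A-Za-z0-9_-]+)+)\"");

    // Collect every literal in the given source text that looks like a key.
    static Set<String> extractKeys(String source) {
        Set<String> keys = new TreeSet<>();
        Matcher m = KEY_LITERAL.matcher(source);
        while (m.find()) {
            keys.add(m.group(1));
        }
        return keys;
    }

    // Keys found in the current release but not in the previous one.
    static Set<String> newKeys(Set<String> previous, Set<String> current) {
        Set<String> added = new TreeSet<>(current);
        added.removeAll(previous);
        return added;
    }

    public static void main(String[] args) {
        // Made-up source snippets standing in for two releases.
        String oldSource =
            "String A = \"mapreduce.map.failures.maxpercent\";";
        String newSource = oldSource
            + "\nString B = \"mapreduce.reduce.failures.maxpercent\";";

        Set<String> added = newKeys(extractKeys(oldSource), extractKeys(newSource));
        System.out.println(added); // prints [mapreduce.reduce.failures.maxpercent]
    }
}
```

In a real build step the source text would come from walking the .java files
of each release, and the resulting set could be compared against the keys
already listed in the *-default.xml files to flag the undocumented ones.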
Can someone comment on whether Cloudera is willing to contribute this to
Apache, or whether someone else is working on something similar? If so, that
would be nice; otherwise I am willing to spend some time on this activity. I
would appreciate a response.