Hive >> mail # dev >> hi


Since we are developing at a very fast pace, it would be really useful to think about the maintainability and testing of our large codebase.
Historically, we have not focused on a few things, and they might soon bite us. I would like to propose the following for all check-ins:
  1.  Javadoc for all public and private functions, except setters and getters. For any complex function, clear examples (input/output) would really help.
  2.  A convention for variable and function names. Do we have one?
  3.  If possible, the name of the test (.q file) in which the function is invoked or, for a query processor change, the query that would test that scenario.
  4.  Especially for query optimizations, it might be a good idea to have a simple working query at the top, along with the expected changes: for example, the operator tree for that query at each step, or a detailed explanation.
  5.  Comments in each test (.q file). They should include the JIRA number, what the test is trying to verify, and the assumptions behind each query.
  6.  Reduced output for each test. Each query result should be bounded by 10 rows; whenever a query outputs more than 10 rows, there should be a reason.
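
To make points 1, 5 and 6 concrete, here is a rough sketch of what I have in mind. It is only an illustration: the class QFileConventions and its methods are hypothetical and do not exist in Hive, HIVE-1234 is a placeholder JIRA number, and I am assuming that .q files use "--" comments and that the row bound is expressed with LIMIT. The Javadoc shows the input/output style of example I am asking for.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of Hive) that checks the text of a .q test
 * file against two of the proposed conventions: a comment naming the JIRA
 * issue, and bounded query output.
 */
public class QFileConventions {

  // A "--" comment line that mentions a JIRA id such as HIVE-1234.
  private static final Pattern JIRA_COMMENT =
      Pattern.compile("(?m)^\\s*--.*\\bHIVE-\\d+\\b");

  // "LIMIT n" in any letter case; group 1 captures n.
  private static final Pattern LIMIT =
      Pattern.compile("(?i)\\bLIMIT\\s+(\\d{1,9})\\b");

  /**
   * Tells whether the test text contains a comment naming a JIRA issue.
   *
   * Examples (HIVE-1234 is a placeholder id):
   *   input "-- HIVE-1234: map join" followed by "SELECT 1;"  gives true
   *   input "SELECT 1;"                                        gives false
   *
   * @param qFileText full text of the .q file
   * @return true if some "--" comment line mentions HIVE-nnn
   */
  public static boolean hasJiraComment(String qFileText) {
    return JIRA_COMMENT.matcher(qFileText).find();
  }

  /**
   * Finds the largest LIMIT value used anywhere in the test text.
   *
   * Examples:
   *   input "SELECT * FROM src LIMIT 10;"  gives 10
   *   input "SELECT 1;"                    gives -1
   *
   * @param qFileText full text of the .q file
   * @return the largest LIMIT value, or -1 if no LIMIT clause is present
   */
  public static int maxLimit(String qFileText) {
    Matcher m = LIMIT.matcher(qFileText);
    int max = -1;
    while (m.find()) {
      max = Math.max(max, Integer.parseInt(m.group(1)));
    }
    return max;
  }
}
```

A check like this only catches the mechanical part (is there a JIRA number, is there a LIMIT above 10). Whether a larger output has a good reason would still need a reviewer.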

In general, commenting the code thoroughly will go a long way toward helping everyone follow along.