While working on the release, I ran into a few issues with LICENSE.txt:
1. We do not record the licenses of the third-party libs anywhere in
the repository. There should be one place where we add the licenses of
everything we redistribute, irrespective of the distro package. Two
reasons: the developer adding a JAR is the best person to include its
license, and this way we don't miss the license for any JAR being
shipped.
2. The license files have to be changed manually while creating the
release artifacts, because they differ across all 5 packages we ship
(src, plus one bin package per Hadoop version: 20, 23, 1.x, 2.x).
3. With so many distro packages, it is unclear which licenses should
be present in the LICENSE.txt in trunk.
The purpose of this mail is to discuss how we can maintain the
licenses and automate generating the license files.
Jarcec has already filed a JIRA for this purpose:
https://issues.apache.org/jira/browse/SQOOP-437
We can treat this discussion as part of that JIRA.
Just to start off:
a. For #1 above, I think we can have a single file containing all the
licenses (or one file per license/JAR), irrespective of the distro.
Devs would add and remove licenses in this file (or these files) as
JARs come and go.
b. For #2, we can automate the license file generation if we have all
the licenses somewhere in trunk. Suggestions on how to automate this
with an existing framework would be appreciated; the fallback approach
is an ad hoc script.
c. For #3, we can take one of the following approaches:
i. Keep all the licenses in the LICENSE.txt in trunk. I personally
vote for this because it also solves the problem of maintaining all
the licenses mentioned in #1 above.
ii. Keep only the licenses of the JARs that are included in the src
package.
iii. Keep the license file as it is now.
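
To make (b) a bit more concrete, here is a rough sketch of what the
fallback ad hoc script could look like. Everything in it is an
assumption for illustration only: a hypothetical licenses/ directory
in trunk holding one <jar-name>.txt file per third-party JAR, a base
license text, and a per-package list of bundled JARs. None of these
names exist in the repository today, and a build-tool plugin may well
be a better fit than this.

```python
from pathlib import Path

SEPARATOR = "=" * 72


def load_licenses(license_dir):
    """Read every *.txt file in license_dir into {jar_name: license_text}.

    Assumes the hypothetical layout licenses/<jar-name>.txt.
    """
    return {
        path.stem: path.read_text(encoding="utf-8").strip()
        for path in sorted(Path(license_dir).glob("*.txt"))
    }


def build_license(base_text, jars, licenses):
    """Return the base license followed by one section per bundled JAR.

    Raises KeyError if a bundled JAR has no recorded license, so a
    release cannot silently ship a JAR without its license.
    """
    missing = sorted(set(jars) - set(licenses))
    if missing:
        raise KeyError("no license recorded for: " + ", ".join(missing))
    sections = [base_text.strip()]
    for jar in sorted(set(jars)):
        sections.append(f"{SEPARATOR}\nFor {jar}:\n\n{licenses[jar]}")
    return "\n\n".join(sections) + "\n"
```

The release step would then call build_license() once per package
(src and each bin package) with that package's JAR list and write the
result into the artifact as LICENSE.txt. Failing on a missing license
is deliberate: it addresses the concern in #1 that we might ship a JAR
without its license.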
Let me know your suggestions. It would be appreciated if anyone can
suggest other options we could take.