Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Sqoop >> mail # dev >> Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

Copy link to this message
Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog

This is an automatically generated e-mail. To reply, visit:
Hi Venkat,
thank you very much for incorporating all my suggestions. I've took a deeper look and the changes seems great to me. I do have couple of high level notes:

1) Tests for profile 200 and 23 are failing on obligate "java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected". I was looking into maven central and it seems that only hadoop 1.x jars were published for HCatalog 0.11.0. Do you think that we can ask the HCatalog/Hive team to also publish Hadoop 2 compatible jars (via different classifier for example).

2) Would you mind updating user guide?

3) I'll update the old-pom.xml file after this will get in as I'm using it for bootstrapping my IntelliJ project and we're missing the new dependencies on HCatalog.

I'm still missing running the patch on a real cluster, but otherwise I feel that we are very close to get it in!


    Nit: It seems that the SqoopOptions class is preferring get methods returning boolean to be called with either "is" or "do", e.g. something like isCreateHCatalogTable().


    Introducing this property is awesome idea, I would suggest to also use it across entire code base (for example in JobBase class). Considering the size of this patch already, I'm completely fine with doing that in follow up JIRA.


    Nit: The indent seems to be off.


    Nit: Can we throw here IllegalArgumentException similarly as in case of toAvroType() method? Dying fast seems to me as a better option that getting NPE somewhere later. I know that the toJavaType() is returning null as well at the moment, but  we can "fix" it in follow up JIRA.


    I can see the same variable isHCatJob in ImportJobBase and ExportJobBase. Do you think that it would make sense to put it into JobBase class instead?


    Nit: Do you think that it would make sense to introduce a new FileType "HCATALOG"?


    Is this method necessary? It seems to be only calling the parent method without any additional logic.


    The "home" variable seems to be unused after the assignment, so I'm assuming that assignements are not necessary?


    I do understand that Hive/HCatalog requires to have lowercase table names, but I'm a bit concerned about doing it without user knowledge. Do you think that it would make sense to detect if user specified uppercase letters and print out a warning?


    Nit: This seems more debug line for me, what do you think?


    Nit: This seems to be doing the same as SqoopOptions.getHCatHomeDefault().


    It seems to me that the --as-avrodatafile and --as-sequencefile are also not compatible with the HCatalog import/export so we might add the validations here.

- Jarek Cecho
On May 24, 2013, 11:18 p.m., Venkat Ranganathan wrote: