Re: Review Request: Request to review patch for SQOOP-954: Create Sqoop runtime scripts to run Sqoop on Windows

> On March 29, 2013, 7:03 p.m., Venkat Ranganathan wrote:
> > Hi Ahmed
> >
> > Thanks for the new patch.  It looks good.  I still have one issue and one suggestion.  The PowerShell script to generate the jar file is very good!  However, you are generating a jar file every time, and the jar file is generated under SQOOP_HOME.  There may be installations where SQOOP_HOME is not writable by the user.  Also, I think the main motivation is to overcome the environment string length limitation.  Since JDK 1.6, Java has provided a wildcard shortcut for including all jars in a directory (this probably should be done for the Unix classpaths also).  Please see http://docs.oracle.com/javase/6/docs/technotes/tools/windows/classpath.html
> >
> > I am wondering whether this could be a simpler change that just adds all jars in SQOOP_LIB: we only have to say %SQOOP_HOME%\lib\*.  Of course, this introduces a dependency on 1.6+ versions of the JDK, but given that 1.5 is EOLed this should be OK.
> >
> > Thanks
> Ahmed El Baz wrote:
>     Thank you very much, Venkat, for the valuable comments.
>     I have considered the wildcard option. However, it has some limitations that made it less preferable, and using the referencing jar gives more flexibility:
>     1) We need to include or exclude particular jars, rather than including every jar in a directory by default with a wildcard. For example, in configure-sqoop the HBase dependency jars are obtained by invoking "hbase classpath", which returns a list of jars. In this case a wrapper jar frees us from worrying about the length of the returned list, and it is not possible to use the * here unless we add some logic to extract the common directories.
>     2) As you can also see in configure-jar, Sqoop depends on more than just SQOOP_HOME\lib, for example HBase, SQOOP_CONF, and ZOOCFGDIR.
>     3) The wrapper jar scales regardless of how many directories we include. I understand it is unlikely that the number of folders grows to the point where we hit the long-command error, but even in that case the wrapper jar would work just fine.
>     I would like to understand more about the scenarios where we anticipate SQOOP_HOME would not be writable on Windows systems.
>     Thank you again,
>     Ahmed
> Venkat Ranganathan wrote:
>     Thanks Ahmed for the explanation.
>     I thought we were primarily constrained by the 8K limit on the command line, so if we can express the directories that hold many jars in this wildcard format, the classpath should fit within the limit.
>     Good point about the hbase -classpath option.  Maybe we can propose an improvement to HBase so that it returns the hbase classpath with the jar dirs properly added.
>     As for SQOOP_HOME not being writable: for example, people may install the Hadoop stack on a Windows terminal server that is shared across multiple users, or install it in a common location that is mapped by logon scripts.  In such setups the directory can become inaccessible for people running Sqoop jobs.  This is a scheme used by some Hadoop distributions today.
>     Thanks
> Venkat Ranganathan wrote:
>     I had written this comment before, but it got stuck in my saved reviews instead of being published.  Sorry about that.  Can you check my comments, and can we simplify this?

Thank you Venkatesh,

I have updated the patch to use jar directories for the classpath locations, rather than the PowerShell script that generated a single jar encapsulating the classpath in its manifest. As discussed, we will need a corresponding change for the HBase case, where hbase.cmd -classpath is invoked to return a list of jar files. For now we use HBASE_HOME and HBASE_HOME\lib in the case of Windows.

- Ahmed
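
Editorial illustration (not part of the patch): the "logic to get common dirs" mentioned in the thread could look roughly like the sketch below. It takes a semicolon-separated list of jar files and directories and replaces each jar with a wildcard entry for its parent directory, which is what shortens the command line. The function name `collapse_to_wildcards` and the sample paths are hypothetical, and the sketch assumes a Java 6+ runtime, where a `dir\*` entry picks up every .jar file directly inside `dir`. As the thread notes, this cannot exclude individual jars from a directory.

```python
import ntpath  # Windows path rules, usable on any platform


def collapse_to_wildcards(classpath, sep=";"):
    """Replace each jar entry with '<parent dir>\\*', keeping first-seen order."""
    seen = set()
    result = []
    for entry in classpath.split(sep):
        entry = entry.strip()
        if not entry:
            continue
        if entry.lower().endswith(".jar"):
            # A wildcard entry pulls in every .jar file in that directory.
            entry = ntpath.join(ntpath.dirname(entry), "*")
        key = entry.lower()  # Windows paths are case-insensitive
        if key not in seen:
            seen.add(key)
            result.append(entry)
    return sep.join(result)


# Hypothetical input: a conf directory, 200 jars in one lib directory,
# and one jar in the install root.
jars = ";".join(
    [r"C:\hbase\conf"]
    + [r"C:\hbase\lib\dep%d.jar" % i for i in range(200)]
    + [r"C:\hbase\hbase.jar"]
)
short = collapse_to_wildcards(jars)
print(len(jars), "->", len(short))
print(short)
```

Non-jar entries such as configuration directories pass through unchanged, so the relative order of directories is preserved; the order of jars within a wildcard directory is left to the JVM.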

On April 22, 2013, 3:26 a.m., Ahmed El Baz wrote: