Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Sqoop >> mail # dev >> Review Request 15639: SQOOP-1236 Sqoop2: Sqoop2: Classpath generated by Submission enginue should contain only unique elements


Copy link to this message
-
Re: Review Request 15639: SQOOP-1236 Sqoop2: Sqoop2: Classpath generated by Submission enginue should contain only unique elements


> On Nov. 18, 2013, 1:21 p.m., Abraham Elmahrek wrote:
> > Only one edge to discuss really... see comment below!

+1 from me Jarcec!
> On Nov. 18, 2013, 1:21 p.m., Abraham Elmahrek wrote:
> > core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java, line 161
> > <https://reviews.apache.org/r/15639/diff/1/?file=387398#file387398line161>
> >
> >     The only problem I see here is that the order that jars are included could make the wrong class be used (or wrong version). Maybe instead of not appending the jar, we can remove the jar from the list and append it to the end of the list?
>
> Jarek Cecho wrote:
>     That is an excellent point Abe, thank you for taking your time to express the concern! Indeed current and even the original solution would not work correctly when the same dependency would be on class path twice (in different versions). Thinking about the problem, since we have only one class loader available, I'm afraid that there is not much we can do about that right now. I can see roughly two different approaches:
>    
>     1) First added to the class path wins (current approach)
>     2) Last added to the class path wins (the suggestion)
>    
>     Considering that Sqoop itself is using this mechanism to put framework dependencies, I'm inclining to the option 1) as this will guarantee that our libraries will be available in expected versions and only the third party code will break (connectors, ...). The benefit of this seems to be that any exception about incorrect dependency will be thrown inside the third party code and not from within the Sqoop code, which might make the debugging simpler. What do you think?

If a jar is in the classpath twice and the first one will be used by the JVM, then option 1 is clearly the better option. Great stuff!
- Abraham
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15639/#review29040
-----------------------------------------------------------
On Nov. 18, 2013, 4:04 p.m., Jarek Cecho wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15639/
> -----------------------------------------------------------
>
> (Updated Nov. 18, 2013, 4:04 p.m.)
>
>
> Review request for Sqoop.
>
>
> Bugs: SQOOP-1236
>     https://issues.apache.org/jira/browse/SQOOP-1236
>
>
> Repository: sqoop-sqoop2
>
>
> Description
> -------
>
> Added code that will ensure that only unique entries will be present in the classpath. I've also used StringUtils.join() to merge all the entries for use (instead of home made version of the same).
>
>
> Diffs
> -----
>
>   core/src/main/java/org/apache/sqoop/framework/SubmissionRequest.java 53d003980a172e4e0acf18630a3496909c17cb5c
>   core/src/test/java/org/apache/sqoop/framework/TestSubmissionRequest.java PRE-CREATION
>   submission/mapreduce/src/main/java/org/apache/sqoop/submission/mapreduce/MapreduceSubmissionEngine.java 6fc485b66ea8c33d09d172675a104954d570f3a7
>
> Diff: https://reviews.apache.org/r/15639/diff/
>
>
> Testing
> -------
>
> Unit tests seems to be passing.
>
>
> Thanks,
>
> Jarek Cecho
>
>