-
hadoop JARs not in lib/ directory of layout
Alejandro Abdelnur 2011-08-04, 18:21
[Using the core-dev@ alias now]
---------- Forwarded message ----------
From: Alejandro Abdelnur <[EMAIL PROTECTED]>
Date: Thu, Aug 4, 2011 at 11:03 AM
Subject: hadoop JARs not in lib/ directory of layout
To: [EMAIL PROTECTED]
What is the rationale for having the hadoop JARs outside of the lib/ directory?
It would definitely simplify packaging configuration if they are under lib/ as well.
Any objection to it?
Thanks.
Alejandro
-
Re: hadoop JARs not in lib/ directory of layout
Alejandro Abdelnur 2011-08-04, 20:06
[moving to core-dev@]
A big release note is doable.
Still, people normally use the 'hadoop' script when submitting jobs, and 'hadoop' takes care of putting the JAR on the classpath. What other things would break?
I'll start working on a patch then.
Thxs.
Alejandro
On Thu, Aug 4, 2011 at 12:52 PM, Allen Wittenauer <[EMAIL PROTECTED]> wrote:
> On Aug 4, 2011, at 11:03 AM, Alejandro Abdelnur wrote:
>
> > What is the rationale for having the hadoop JARs outside of the lib/
> > directory?
> >
> > It would definitely simplify packaging configuration if they are under
> > lib/ as well.
> >
> > Any objection to it?
>
> It needs a big release note as this will break users in bad bad ways.
-
Re: hadoop JARs not in lib/ directory of layout
Allen Wittenauer 2011-08-04, 20:54
On Aug 4, 2011, at 1:06 PM, Alejandro Abdelnur wrote:
> [moving to core-dev@]
>
> A big release note is doable.
>
> Still, people normally use 'hadoop' script when submitting jobs and 'hadoop'
> would take care of having the JAR in the classpath. What other things would
> break?
Everyone using streaming, for starters. I suspect lots of pig, hive, and hbase installations will also break.
-
Re: hadoop JARs not in lib/ directory of layout
Alejandro Abdelnur 2011-08-04, 20:59
Allen,
Pig, Hive bundle Hadoop JARs with distributions, so no issue there.
And those who use the Hadoop JARs from HADOOP_HOME normally add all dirs under ${HADOOP_HOME}/lib.
And finally, with the new layout of Hadoop trunk, things will be broken anyhow for them, so work will have to be done there.
Thanks.
Alejandro
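
As a rough illustration of the "add all dirs under ${HADOOP_HOME}/lib" pattern mentioned above, a wrapper script might sweep one or more directories for JARs along the following lines. This is only a sketch: the function name `jars_in_dirs` is hypothetical and is not part of any Hadoop script. A script written this way would pick up the Hadoop JARs automatically if they moved under lib/.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): join every *.jar found directly
# under the given directories into a colon-separated classpath string.
jars_in_dirs() {
    classpath=""
    for dir in "$@"; do
        for jar in "$dir"/*.jar; do
            # An unmatched glob stays literal, so skip anything that is not a file.
            [ -f "$jar" ] || continue
            classpath="${classpath:+$classpath:}$jar"
        done
    done
    printf '%s\n' "$classpath"
}

# Assumed usage: sweep only lib/, as described in the message above.
# CLASSPATH=$(jars_in_dirs "$HADOOP_HOME/lib")
```

With the layout under discussion, where the Hadoop JARs sit outside lib/, the caller would have to pass the parent directory as a second argument.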
-
Re: hadoop JARs not in lib/ directory of layout
Allen Wittenauer 2011-08-04, 21:12
On Aug 4, 2011, at 1:59 PM, Alejandro Abdelnur wrote:
> Pig, Hive bundle Hadoop JARs with distributions, so no issue there.
Re-read what I said:
>> I suspect lots of pig, hive, and hbase installations will also break.
It still remains a potential issue for those of us who build our own and don't like to have multiple copies of the same jar floating around.
> And finally, with the new layout of Hadoop trunk, things will be broken
> anyhow for them, so work will have to be done there.
That may be, but it is still important to point this change out. Otherwise all the release notes would just be "some stuff changed! have fun finding it all!"
No release note == -1 from me.
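
For sites that build their own stack and want to avoid multiple copies of the same JAR, a small check like the one below could flag file names that appear in more than one directory. It is a sketch under assumptions: `duplicate_jars` is a hypothetical name, and it compares exact file names only, so two different versions of the same library would not be reported.

```shell
#!/bin/sh
# Hypothetical helper: print JAR file names that occur in more than one of
# the given directories (one name per line, sorted).
duplicate_jars() {
    for dir in "$@"; do
        for jar in "$dir"/*.jar; do
            # Skip the literal pattern left behind by an unmatched glob.
            [ -f "$jar" ] && basename "$jar"
        done
    done | sort | uniq -d
}

# Assumed usage, with directory names chosen only for illustration:
# duplicate_jars "$HADOOP_HOME/lib" "$PIG_HOME/lib" "$HIVE_HOME/lib"
```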
-
Re: hadoop JARs not in lib/ directory of layout
Alejandro Abdelnur 2011-08-04, 21:23
Allen,
I agree 100% with you, there will be setups that will break. I was just pointing out that it may be a bit less than expected because of the scenarios I described. But yes, you are right, something, somewhere, will break.
And being more clear than before, yes, it MUST GO in the release notes.
Thanks.
Alejandro
-
Re: hadoop JARs not in lib/ directory of layout
Arindam Khaled 2011-08-05, 22:06
Please unsubscribe me. Thanks.
Kind regards,
Arindam Khaled
-
Re: hadoop JARs not in lib/ directory of layout
Alejandro Abdelnur 2011-08-10, 18:10
Eric,
I'd argue that including the JAR as you suggest will most likely break because of required dependencies of the Hadoop JAR that may not be part of HBase (e.g. the jackson JARs).
But if you want to still do that you can always include the jar from the lib directory, for example:
$HBASE_PREFIX/share/hbase/hbase*.jar:$HBASE_PREFIX/share/hbase/lib/*.jar:$HADOOP_PREFIX/share/hadoop/lib/hadoop-*.jar
Thoughts?
Thanks.
Alejandro
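
A note on the classpath patterns in this exchange: the JVM's own classpath wildcard only covers a whole directory (`dir/*`), so partial patterns such as `hbase*.jar` or `hadoop-*.jar` have to be expanded by the launching script. The sketch below shows one way a script might do that. The function name `expand_classpath` is hypothetical, the `$HBASE_PREFIX` and `$HADOOP_PREFIX` paths are the ones quoted in the thread, and the code assumes no path contains whitespace.

```shell
#!/bin/sh
# Hypothetical helper: expand each glob pattern in the shell and join the
# matching files with ':'. Callers pass the patterns quoted so that they are
# expanded here rather than at the call site.
expand_classpath() {
    result=""
    for pattern in "$@"; do
        # Deliberately unquoted so the shell performs pathname expansion.
        # Assumption: paths contain no whitespace.
        for match in $pattern; do
            [ -f "$match" ] || continue
            result="${result:+$result:}$match"
        done
    done
    printf '%s\n' "$result"
}

# Assumed usage with the lib/ variant of the example from the thread:
# CLASSPATH=$(expand_classpath \
#     "$HBASE_PREFIX/share/hbase/hbase*.jar" \
#     "$HBASE_PREFIX/share/hbase/lib/*.jar" \
#     "$HADOOP_PREFIX/share/hadoop/lib/hadoop-*.jar")
```

Selecting only `hadoop-*.jar` from lib/ leaves out the dependencies of the Hadoop JAR, which is the breakage risk raised in the message above.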
On Thu, Aug 4, 2011 at 3:47 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> It is easier to write shell script to import jar files by directory instead
> of explicitly reference to a few jars with specific versions.
>
> The common use case is:
>
> HBase needs to use hadoop jar files, but HBase depends on more recent
> version of log4j. The construction of the class path would be:
>
> $HBASE_PREFIX/share/hbase/hbase*.jar:$HBASE_PREFIX/share/hbase/lib/*.jar:$HADOOP_PREFIX/share/hadoop/*.jar
>
> This provides a way to segment the library loading with least amount of
> scripting and loosely coupled.
>
> regards,
> Eric
>
> On Aug 4, 2011, at 2:40 PM, Alejandro Abdelnur wrote:
>
> > [moving to core-dev@, general@ BCCed]
> >
> > Eric,
> >
> > Even if the JAR is in lib/ you could import/use that JAR only.
> >
> > How would you use Hadoop JARs without its dependencies? Many things will
> > break unless you add the dependency JARs.
> >
> > Granted, there are JARs that are used by Hadoop server side only
> > (JT/NN/TT/DN/SNN), but that is a different thing. Having a client side set
> > of JARs would help handle this (MAPREDUCE-1638).
> >
> > Thoughts?
> >
> > Thanks.
> >
> > Alejandro
> >
> > On Thu, Aug 4, 2011 at 1:14 PM, Eric Yang <[EMAIL PROTECTED]> wrote:
> >
> > > The jar files placement outside of lib directory is to ensure the project
> > > generated jar files are not mixed with it's dependencies.
> > > Hence, if another project tries to import current project's jar files
> > > without dependencies, it is possible to do so.
> > >
> > > regards,
> > > Eric
-
Re: hadoop JARs not in lib/ directory of layout
Eric Yang 2011-08-10, 18:40
On Aug 10, 2011, at 11:10 AM, Alejandro Abdelnur wrote:
> Eric,
>
> I'd argue that including the JAR as you suggest will most likely break
> because of required dependencies of the Hadoop JAR that may not be part of
> HBase (ie the jackson JARs).
>
> But if you want to still do that you can always include the jar from the lib
> directory, for example:
>
> $HBASE_PREFIX/share/hbase/hbase*.jar:$HBASE_PREFIX/share/hbase/lib/*.jar:$HADOOP_PREFIX/share/hadoop/lib/hadoop-*.jar
This example shows the HBase jar file and the Hadoop jar file located in different directories, which seems inconsistent. The subtle distinction of having $project*.jar in share/$project is good to have. I don't have a strong opinion on whether they are merged into one directory, but the layout should be consistent across projects. You might want to get buy-in from the community before making this change. Owen was in favor of keeping third-party libraries in separate directories, and I am in favor of his design.
regards, Eric
-
Re: hadoop JARs not in lib/ directory of layout
Alejandro Abdelnur 2011-08-10, 20:28
Eric,
I just copied and pasted the example you wrote in your previous answer and added 'lib/' to it.
My concern with the current approach is that it complicates the packaging logic, requiring special handling of artifacts (JARs), and it complicates the products' scripts.
Thanks.
Alejandro