MapReduce >> mail # user >> Too large class path for map reduce jobs


Henning Blohm 2010-09-17, 11:56
Allen Wittenauer 2010-09-17, 16:01
Henning Blohm 2010-09-17, 18:53
Henning Blohm 2010-09-24, 10:41
Tom White 2010-10-05, 22:59
Henning Blohm 2010-10-06, 09:57
Alejandro Abdelnur 2010-10-06, 10:28
Henning Blohm 2010-10-06, 11:57
Alejandro Abdelnur 2010-10-07, 05:02
Alejandro Abdelnur 2010-10-07, 05:22
Henning Blohm 2010-10-07, 07:43
Alejandro Abdelnur 2010-10-07, 08:23
Re: Too large class path for map reduce jobs
I wonder if there is a misunderstanding here - the problem is that the
classpath has too many classes on it (and clashes with user classes),
rather than it being a text string which is too long.

I would suggest that the technical discussion of how to fix this goes
onto the JIRA.

Cheers,
Tom
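
The distinction drawn here is between the length of the classpath string and the set of classes it makes visible. A minimal Python sketch of how such clashes could be spotted: it lists class files shipped by more than one JAR. This is not Hadoop code; `find_class_clashes` is a hypothetical helper name, and it assumes every classpath entry is a readable JAR file (directories are not scanned).

```python
import zipfile
from collections import defaultdict


def find_class_clashes(jar_paths):
    """Map each class file found in more than one JAR to the JARs holding it.

    jar_paths is assumed to be in classpath order, so the first JAR listed
    for a class is the one a standard JVM class loader would pick.
    """
    owners = defaultdict(list)
    for jar in jar_paths:
        with zipfile.ZipFile(jar) as zf:
            # set() guards against a JAR that lists the same entry twice.
            for entry in set(zf.namelist()):
                if entry.endswith(".class"):
                    owners[entry].append(jar)
    return {cls: jars for cls, jars in owners.items() if len(jars) > 1}
```

On a standard JVM classpath the first JAR that contains a class wins, so a framework-bundled copy listed ahead of the job's JARs would shadow the user's version regardless of how short the classpath string is.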

On Thu, Oct 7, 2010 at 1:23 AM, Alejandro Abdelnur <[EMAIL PROTECTED]> wrote:
> well, if the issue is a too long classpath, the softlink thingy will give
> some room to breathe as the total CP length will be much smaller.
>
> A
> On Thu, Oct 7, 2010 at 3:43 PM, Henning Blohm <[EMAIL PROTECTED]>
> wrote:
>>
>> So that's actually another issue, right? Besides splitting the classpath
>> into those three groups, you want the TT to create soft-links on demand to
>> simplify the computation of the classpath string. Is that right?
>>
>> But it's the TT that actually starts the job VM. Why does it matter what
>> the string actually looks like, as long as it has the right content?
>>
>> Thanks,
>>   Henning
>>
>> On Thu, 2010-10-07 at 13:22 +0800, Alejandro Abdelnur wrote:
>>
>> [sent too soon]
>>
>> The first CP shown is what the CP of a task looks like today. If we change
>> it to pick up all the job JARs from the current dir, then the classpath will
>> be much shorter (second CP shown). We can easily achieve this by
>> soft-linking the job JARs in the work dir of the task.
>>
>> Alejandro
>>
>> On Thu, Oct 7, 2010 at 1:02 PM, Alejandro Abdelnur <[EMAIL PROTECTED]>
>> wrote:
>>
>> Fragmentation of Hadoop classpaths is another issue: Hadoop should
>> differentiate three CPs:
>>
>> 1. client CP: what is needed to submit a job (only the nachos)
>>
>> 2. server CP (JT/NN/TT/DN): what is needed to run the cluster (the whole
>> enchilada)
>>
>> 3. job CP: what is needed to run a job (some of the enchilada)
>>
>>
>> But I'm not trying to get into that here. What I'm suggesting is:
>>
>>
>>
>> -----
>>
>> # Hadoop JARs:
>>
>> /Users/tucu/dev-apps/hadoop/conf
>>
>>
>> /System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home/lib/tools.jar
>>
>> /Users/tucu/dev-apps/hadoop/bin/..
>>
>> /Users/tucu/dev-apps/hadoop/bin/../hadoop-core-0.20.3-CDH3-SNAPSHOT.jar
>>
>> /Users/tucu/dev-apps/hadoop/bin/../lib/aspectjrt-1.6.5.jar
>>
>> ..... (about 30 jars from hadoop lib/ )
>>
>> /Users/tucu/dev-apps/hadoop/bin/../lib/jsp-2.1/jsp-api-2.1.jar
>>
>> # Job JARs (for a job with only 2 JARs):
>>
>>
>> /Users/tucu/dev-apps/hadoop/dirs/mapred/taskTracker/distcache/-2707763075630339038_639898034_1993697040/localhost/user/tucu/oozie-tucu/0000003-101004184132247-oozie-tucu-W/java-node--java/java-launcher.jar
>>
>>
>> /Users/tucu/dev-apps/hadoop/dirs/mapred/taskTracker/distcache/3613772770922728555_-588832047_1993624983/localhost/user/tucu/examples/apps/java-main/lib/oozie-examples-2.2.1-CDH3B3-SNAPSHOT.jar
>>
>>
>> /Users/tucu/dev-apps/hadoop/dirs/mapred/taskTracker/tucu/jobcache/job_201010041326_0058/attempt_201010041326_0058_m_000000_0/work
>>
>> -----
>>
>>
>>
>> What I'm suggesting is that the latter group, the job JARs, be
>> soft-linked (by the TT) into the working directory, so that their
>> classpath is just:
>>
>> -----
>>
>> java-launcher.jar
>>
>> oozie-examples-2.2.1-CDH3B3-SNAPSHOT.jar
>>
>> .
>>
>> -----
>>
>>
>>
>>
>> Alejandro
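
The soft-link proposal quoted above can be sketched as follows. This is a minimal Python illustration, not the TaskTracker's actual code: `link_job_jars` is a hypothetical helper, and it assumes the task JVM is started with the work dir as its current directory so that the relative entries resolve.

```python
import os


def link_job_jars(job_jars, work_dir):
    """Soft-link each job JAR into work_dir and return a short classpath.

    The returned entries are bare file names plus ".", so they are only
    valid relative to work_dir.
    """
    entries = []
    for jar in job_jars:
        name = os.path.basename(jar)
        link = os.path.join(work_dir, name)
        if os.path.lexists(link):
            # Two job JARs with the same file name cannot share one dir.
            raise FileExistsError("name clash in work dir: " + name)
        os.symlink(os.path.abspath(jar), link)
        entries.append(name)
    entries.append(".")  # the work dir itself, as in the quoted listing
    return os.pathsep.join(entries)
```

One consequence of flattening everything into a single directory is that two job JARs with the same file name collide; the sketch raises an error in that case rather than choosing a winner. Note also that this shortens the string only: the same classes remain visible to the task.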
>>
>> On Wed, Oct 6, 2010 at 7:57 PM, Henning Blohm <[EMAIL PROTECTED]>
>> wrote:
>>
>> Hi Alejandro,
>>
>>    yes, it can of course be done right (sorry if my wording seemed to
>> imply otherwise). Just saying that I think that Hadoop M/R should not go
>> into that class loader / module separation business. It's one Job, one VM,
>> right? So the problem is to assign just the stuff needed to let the Job do
>> its business without becoming an obstacle.
>>
>>   Must admit I didn't understand your proposal 2. How would that remove
>> (e.g.) jetty libs from the job's classpath?
>>
>> Thanks,
>>   Henning
>>
>> On Wednesday, 06.10.2010, at 18:28 +0800, Alejandro Abdelnur wrote:
>>
>> 1. Classloader business can be done right. Actually it could be done as
Henning Blohm 2010-10-08, 07:52