MapReduce, mail # user - Too large class path for map reduce jobs


+
Henning Blohm 2010-09-17, 11:56
+
Allen Wittenauer 2010-09-17, 16:01
+
Henning Blohm 2010-09-17, 18:53
+
Henning Blohm 2010-09-24, 10:41
Re: Too large class path for map reduce jobs
Tom White 2010-10-05, 22:59
Hi Henning,

I don't know if you've seen
https://issues.apache.org/jira/browse/MAPREDUCE-1938 and
https://issues.apache.org/jira/browse/MAPREDUCE-1700, which both
discuss this issue.

Cheers
Tom

On Fri, Sep 24, 2010 at 3:41 AM, Henning Blohm <[EMAIL PROTECTED]> wrote:
> Short update on the issue:
>
> I tried to find a way to separate class path configurations by modifying the
> scripts in HADOOP_HOME/bin, but found that TaskRunner copies the class path
> setting from the parent process when starting a local task. As a result, I
> do not see a way of having less on a job's class path without modifying
> Hadoop.
>
> As that will present a real issue when running our jobs on Hadoop I would
> like to propose to change TaskRunner so that it sets a class path
> specifically for M/R tasks. That class path could be defined in the scripts
> (as for the other processes) using a particular environment variable (e.g.
> HADOOP_JOB_CLASSPATH). It could default to the current VM's class path,
> preserving today's behavior.
>
> Is it ok to enter this as an issue?
>
> Thanks,
>   Henning
>
>
> On Friday, 17.09.2010, 16:01 +0000, Allen Wittenauer wrote:
>
> On Sep 17, 2010, at 4:56 AM, Henning Blohm wrote:
>
>> When running map reduce tasks in Hadoop I run into classpath issues.
>> Contrary to previous posts, my problem is not that I am missing classes on
>> the Task's class path (we have a perfect solution for that) but rather that
>> I find too many there (e.g. ECJ classes or jetty).
>
> The fact that you mention:
>
>> The libs in HADOOP_HOME/lib seem to contain everything needed to run
>> anything in Hadoop which is, I assume, much more than is needed to run a map
>> reduce task.
>
> hints that your perfect solution is to throw all your custom stuff in lib.
> If so, that's a huge mistake.  Use distributed cache instead.
>
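
Henning's proposal quoted above (a dedicated class path for M/R tasks that defaults to the current VM's class path) could be sketched roughly as follows. This is only an illustration of the lookup-with-fallback idea, not Hadoop code: the class TaskClassPath and its resolve method are hypothetical, and HADOOP_JOB_CLASSPATH is the variable name suggested in the thread, not one that Hadoop is known to define.

```java
import java.util.Map;

// Hypothetical helper, not part of Hadoop. It only sketches the proposed
// lookup: prefer a task-specific class path, else keep today's behavior.
final class TaskClassPath {
    // Variable name suggested in the thread; assumed, not defined by Hadoop.
    static final String ENV_VAR = "HADOOP_JOB_CLASSPATH";

    /**
     * Returns the class path a child task JVM would be started with.
     * Falls back to the parent's class path when the variable is unset
     * or blank, which preserves the current behavior.
     */
    static String resolve(Map<String, String> env, String parentClassPath) {
        String configured = env.get(ENV_VAR);
        if (configured == null || configured.trim().isEmpty()) {
            return parentClassPath;
        }
        return configured;
    }

    public static void main(String[] args) {
        // In the parent process, the inputs would be its own environment
        // and its own java.class.path system property.
        String cp = resolve(System.getenv(), System.getProperty("java.class.path"));
        System.out.println("task class path: " + cp);
    }
}
```

Passing the environment in as a map keeps the fallback rule easy to check in isolation; how TaskRunner would actually consume the result is left open here.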
+
Henning Blohm 2010-10-06, 09:57
+
Alejandro Abdelnur 2010-10-06, 10:28
+
Henning Blohm 2010-10-06, 11:57
+
Alejandro Abdelnur 2010-10-07, 05:02
+
Alejandro Abdelnur 2010-10-07, 05:22
+
Henning Blohm 2010-10-07, 07:43
+
Alejandro Abdelnur 2010-10-07, 08:23
+
Tom White 2010-10-07, 20:27
+
Henning Blohm 2010-10-08, 07:52