MapReduce user mailing list

Re: Too large class path for map reduce jobs
Hi Henning,

I don't know if you've seen
https://issues.apache.org/jira/browse/MAPREDUCE-1938 and
https://issues.apache.org/jira/browse/MAPREDUCE-1700, which discuss
this issue.


On Fri, Sep 24, 2010 at 3:41 AM, Henning Blohm <[EMAIL PROTECTED]> wrote:
> Short update on the issue:
> I tried to find a way to separate class path configurations by modifying the
> scripts in HADOOP_HOME/bin but found that TaskRunner actually copies the
> class path setting from the parent process when starting a local task so
> that I do not see a way of having less on a job's classpath without
> modifying Hadoop.
> As that will present a real issue when running our jobs on Hadoop I would
> like to propose to change TaskRunner so that it sets a class path
> specifically for M/R tasks. That class path could be defined in the scripts
> (as for the other processes) using a particular environment variable (e.g.
> HADOOP_JOB_CLASSPATH). It could default to the current VM's class path,
> preserving today's behavior.
> Is it ok to enter this as an issue?
> Thanks,
>   Henning
> On Friday, 17.09.2010 at 16:01 +0000, Allen Wittenauer wrote:
> On Sep 17, 2010, at 4:56 AM, Henning Blohm wrote:
>> When running map reduce tasks in Hadoop I run into classpath issues.
>> Contrary to previous posts, my problem is not that I am missing classes on
>> the Task's class path (we have a perfect solution for that) but rather find
>> too many (e.g. ECJ classes or jetty).
> The fact that you mention:
>> The libs in HADOOP_HOME/lib seem to contain everything needed to run
>> anything in Hadoop which is, I assume, much more than is needed to run a map
>> reduce task.
> hints that your perfect solution is to throw all your custom stuff in lib.
> If so, that's a huge mistake.  Use distributed cache instead.
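
For readers of the archive, here is a rough sketch of Allen's suggestion: ship job-specific jars through the distributed cache instead of dropping them into HADOOP_HOME/lib. The jar and class names below are made up for illustration:

```shell
# Hypothetical jar and class names. The -libjars option (handled by
# GenericOptionsParser / ToolRunner) copies the listed jars into the
# distributed cache and adds them to each task's classpath, so nothing
# job-specific has to live in HADOOP_HOME/lib on the cluster nodes.
hadoop jar myjob.jar com.example.MyJob \
    -libjars mylib-a.jar,mylib-b.jar \
    /input /output
```

Note that -libjars only takes effect when the job's driver class runs through ToolRunner (or otherwise passes its arguments through GenericOptionsParser).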