How to work around MAPREDUCE-1700
Anyone have any ideas how I might be able to work around
https://issues.apache.org/jira/browse/MAPREDUCE-1700 ?  It's quite a
thorny issue!

I have an M/R job that uses Avro (v1.3.3).  Avro, in turn, has a
dependency on Jackson (of which I'm using v1.5.4).  I'm able to add the
jars to the distributed cache fine, and my Mapper starts to run and load
Avro ... and then blammo:

    Error: org.codehaus.jackson.JsonFactory.enable(Lorg/codehaus/jackson/JsonParser$Feature;)Lorg/codehaus/jackson/JsonFactory;
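To illustrate what's failing, here's a minimal sketch.  I'm assuming,
based on the method descriptor in the error, that this is a
NoSuchMethodError on Jackson's chaining enable() call, which exists in
Jackson 1.5.x but not in the older Jackson on the cluster.  This is
illustrative code, not a snippet from my actual job:

    import org.codehaus.jackson.JsonFactory;
    import org.codehaus.jackson.JsonParser;

    public class JacksonProbe {
        public static void main(String[] args) {
            // Compiles against Jackson 1.5.x, where enable() returns the
            // JsonFactory for call chaining.  If the JVM resolves
            // JsonFactory from a jar lacking a method with this exact
            // descriptor, the call site fails with the error quoted above.
            JsonFactory factory = new JsonFactory()
                    .enable(JsonParser.Feature.ALLOW_COMMENTS);
            System.out.println("Jackson looks OK: " + factory);
        }
    }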

The problem is that an older (and obviously incompatible) version of
Jackson (v1.0.1) is already included in the Hadoop distribution.  Since
that copy appears earlier on the classpath than my Jackson jars, I get
the error.
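One way to confirm which copy is winning is a quick diagnostic sketch
using the standard getProtectionDomain()/getCodeSource() API (note that
getCodeSource() can return null for bootstrap classes, but not for a
jar on the application classpath):

    import org.codehaus.jackson.JsonFactory;

    public class WhichJackson {
        public static void main(String[] args) {
            // Prints the jar the running JVM actually loaded JsonFactory
            // from.  On an affected task node I'd expect this to point at
            // Hadoop's bundled jackson-core-asl-1.0.1.jar rather than the
            // 1.5.4 jar shipped via the distributed cache.
            System.out.println(JsonFactory.class.getProtectionDomain()
                    .getCodeSource().getLocation());
        }
    }

Running that same line from inside the Mapper would show what the task
JVM itself sees.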

There doesn't seem to be any elegant solution to this.  I can't
downgrade to an earlier version of Avro, as my code relies on features
in the newer version.  And there doesn't seem to be any way to solve
this through configuration either (i.e., to tell Hadoop to use the
newer Jackson jars for my M/R job, or to add those jars earlier on the
classpath).

As near as I can tell, the only solutions involve doing a hack on each
of my slave nodes.  I.e., either:

a) removing the existing Jackson jars from each slave (since I have no
need for the Hadoop feature that requires them); or

b) putting my newer Jackson jars onto each slave in a place where they
will be loaded before the older ones (e.g.,
/usr/lib/hadoop-0.20/lib/aaaaa_jackson-core-asl-1.5.4.jar).

Either of these options is a bit of a hack - and error-prone as well,
since my job tasks will fail on any node that doesn't have the hack
applied.
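The best I can do to soften that is a fail-fast check at task startup,
so a misconfigured node dies with a clear message instead of a cryptic
NoSuchMethodError mid-job.  A sketch (JacksonGuard and
requireModernJackson are hypothetical names of my own, not existing
API; the reflective lookup assumes the bundled old Jackson simply has
no enable(Feature) method at all, which is what the error suggests):

    import org.codehaus.jackson.JsonFactory;
    import org.codehaus.jackson.JsonParser;

    public class JacksonGuard {
        // Call this from the Mapper's setup/configure so that a node
        // still carrying the old Jackson fails immediately with a clear
        // message.
        public static void requireModernJackson() {
            try {
                // Look up the chaining enable() method that the failing
                // call site expects.
                JsonFactory.class.getMethod("enable",
                        JsonParser.Feature.class);
            } catch (NoSuchMethodException e) {
                throw new IllegalStateException(
                        "Stale Jackson first on classpath, loaded from: "
                        + JsonFactory.class.getProtectionDomain()
                                .getCodeSource().getLocation(), e);
            }
        }
    }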
Is there any cleaner way to resolve this issue that I'm not seeing?

Thanks,

DR