hadoop 0.23.5 -files and -libjars
Hi,

I have been experimenting with the hadoop jar command in 0.23.5 and
Hive 0.9.0, and wanted to run a custom MapReduce job using:

hadoop jar <jar> <main-class> -libjars "comma-separated list of files"
-files "comma-separated list of files"

Both -libjars and -files specify the same files. The first problem is that
when I use GenericOptionsParser, it sets
"mapreduce.client.genericoptionsparser.used" to true but does not populate
the "tmpjars" and "tmpfiles" configuration properties. I am not sure why.
Any ideas?

I then looked at the GenericOptionsParser code on GitHub and copied the
relevant pieces into my Driver class. That exact same code does add
"tmpjars" and "tmpfiles" to the job.xml. I am not sure why this works when
invoking GenericOptionsParser itself does not.

I can see the -files entries in the filecache. I also printed the classpath
in the Mapper setup method, and the jar is listed there as file:/{filepath}.
I checked that path: it exists and is accessible to that user. However, when
I call getClassByName on a class in that jar, I keep getting a
ClassNotFoundException. Why would this happen? If the file is on the
classpath, I would expect either Class.forName or
context.getConfiguration().getClassByName to work, but neither does.
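
For reference, the check described above can be reduced to a small
standalone probe. This is only a sketch under assumptions, not Hadoop code:
the class name ClasspathProbe and its methods are hypothetical, it uses only
the JDK, and it assumes the classpath being inspected is the java.class.path
system property. It reports which classpath entries do not exist as plain
file paths and whether a given class is visible to the calling class's
loader versus the thread context class loader, since those two can differ.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper (not part of Hadoop): JDK-only.
public class ClasspathProbe {

    /** Returns true if the given loader can resolve className (without initializing it). */
    public static boolean canLoad(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Returns the java.class.path entries that do not exist as plain file paths. */
    public static List<String> missingEntries() {
        List<String> missing = new ArrayList<>();
        String classpath = System.getProperty("java.class.path", "");
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue; // skip empty segments such as a trailing separator
            }
            if (!new File(entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The class to look for; defaults to a class that is always present.
        String className = args.length > 0 ? args[0] : "java.lang.String";

        ClassLoader own = ClasspathProbe.class.getClassLoader();
        ClassLoader context = Thread.currentThread().getContextClassLoader();

        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println("entries not found on disk = " + missingEntries());
        System.out.println("own loader sees " + className + ": " + canLoad(className, own));
        System.out.println("context loader sees " + className + ": " + canLoad(className, context));
    }
}
```

Calling the same two methods from the Mapper setup method, with the name of
the class that fails to load, would show whether the jar is merely listed or
is actually visible to a loader. One thing this may surface: an entry
written as a URL rather than a plain path would be reported as not found on
disk by the File check.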

I am running a single-node cluster for all of this experimentation, but I
don't want to explicitly add all the jars to hadoop-env or yarn-env.

Any help will be appreciated.

Thanks,
Viral
Viral Bajaria 2013-01-07, 11:42