Viral Bajaria 2013-01-07, 11:46
hadoop 0.23.5 -files and -libjars
Viral Bajaria 2013-01-07, 11:42
Hi,
I have been experimenting with the hadoop jar command on Hadoop 0.23.5 and Hive 0.9.0, and wanted to run a custom MapReduce job using:

hadoop jar <jar> <main-class> -libjars "comma-separated list of files" -files "comma-separated list of files"

Both -libjars and -files specify the same files.

The first problem: when I use GenericOptionsParser, it sets "mapreduce.client.genericoptionsparser.used" to true but does not populate the "tmpjars" and "tmpfiles" configuration properties. I am not sure why. Any ideas?

I then looked at the GenericOptionsParser code on GitHub and pulled the relevant pieces into my Driver class. The exact same code adds "tmpjars" and "tmpfiles" to the job.xml. I am not sure why this works while invoking GenericOptionsParser does not.

I can see the -files entries in the filecache. I printed the classpath in the Mapper setup method and can see the jar listed as file:/{filepath}. I checked that path; it exists and is accessible to that user. I called getClassByName on a class in that jar but keep getting a ClassNotFoundException. Any reason why this would happen? If the file is on the classpath, I would expect either Class.forName or context.getConfiguration().getClassByName to work, but neither does.

I am running a single-node cluster for all the experimentation but don't want to explicitly add all the jars to hadoop-env or yarn-env.

Any help will be appreciated.

Thanks,
Viral
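
One way to narrow down the ClassNotFoundException described above is to separate two questions: does the jar physically contain the class file, and which class loader (if any) can resolve it? The following is a minimal diagnostic sketch using only the JDK, not Hadoop APIs. The class name ClasspathProbe and its method names are hypothetical, it assumes Java 7 or later, and it would have to be adapted (for example, called from the Mapper setup method) to reflect what the task JVM actually sees.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of Hadoop.
public class ClasspathProbe {

    /** Maps a fully qualified class name to the entry name it would have inside a jar. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath physically contains the class file. */
    static boolean jarContains(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(entryNameFor(className)) != null;
        }
    }

    /** Returns the first of the given loaders that can resolve the class, or null if none can. */
    static ClassLoader firstLoaderFor(String className, ClassLoader... loaders) {
        for (ClassLoader loader : loaders) {
            if (loader == null) {
                continue; // a null loader means the bootstrap loader; skipped here
            }
            try {
                // initialize=false: only check that the class resolves, do not run static init
                Class.forName(className, false, loader);
                return loader;
            } catch (ClassNotFoundException | LinkageError e) {
                // this loader cannot resolve the class; try the next one
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: ClasspathProbe <jar-path> <class-name>");
            return;
        }
        String jarPath = args[0];
        String className = args[1];

        System.out.println("jar contains class file: " + jarContains(jarPath, className));

        ClassLoader context = Thread.currentThread().getContextClassLoader();
        ClassLoader own = ClasspathProbe.class.getClassLoader();
        System.out.println("resolved by existing loader: " + firstLoaderFor(className, context, own));

        // Load straight from the jar, bypassing whatever the classpath string claims.
        URL jarUrl = new File(jarPath).toURI().toURL();
        try (URLClassLoader direct = new URLClassLoader(new URL[] { jarUrl }, own)) {
            System.out.println("resolved directly from jar: "
                    + (firstLoaderFor(className, direct) != null));
        }
    }
}
```

If the jar contains the class file and loading directly from the jar succeeds while the existing loaders fail, the path printed in the classpath is probably not being consulted by the loader that performs the lookup. If the direct load also fails, the jar contents or the class name are the more likely culprit.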