-
pig using zebra, ClassNotFoundException on TableOutputFormat
Bennie Schut 2009-11-13, 11:02
I'm looking into improving the performance of one of my pig jobs. I figured that storing the data I keep reusing in a binary/serialized format could help a little, and that is how I stumbled upon zebra. It seems like a nice abstraction and appears to do exactly what I want to achieve.
I started with something simple, but it doesn't work.
register zebra-0.6.0-dev.jar;
dim_calendar = load '/user/dwh/dim/calendar.csv' using PigStorage('\t') as (cldr_id: long, iso_date: chararray);
outfile = order dim_calendar by iso_date parallel 1;
store outfile into '/user/dwh/calendar.zebra' using org.apache.hadoop.zebra.pig.TableStorer('cldr_id: long, iso_date:string');
On running this I get:
---------------
ERROR 2117: Unexpected error when launching map reduce job.

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias 97
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1003)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:385)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:720)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
        at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
        at org.apache.pig.Main.main(Main.java:352)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2117: Unexpected error when launching map reduce job.
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:194)
        at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:249)
        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:780)
        at org.apache.pig.PigServer.execute(PigServer.java:773)
        at org.apache.pig.PigServer.access$100(PigServer.java:89)
        at org.apache.pig.PigServer$Graph.execute(PigServer.java:951)
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:998)
        ... 7 more
Caused by: java.lang.RuntimeException: Could not resolve error that occured when launching map reduce job: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.zebra.pig.TableOutputFormat
        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(MapReduceLauncher.java:428)
        at java.lang.Thread.dispatchUncaughtException(Thread.java:1831)
-----
Any idea why? TableOutputFormat is an inner class of TableStorer, so I'm a little puzzled how it could find one but not the other. FYI, I'm using hadoop-0.20.1 and pig/zebra from trunk, but I haven't updated pig in a few weeks.
Thanks, Bennie.
-
RE: pig using zebra, ClassNotFoundException on TableOutputFormat
Santhosh Srinivasan 2009-11-13, 15:04
Bennie,
Include zebra-0.6.0-dev.jar in your classpath and then relaunch pig.
Santhosh
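A minimal sketch of the suggestion above, assuming Pig is launched through the standard bin/pig script, which picks up extra classpath entries from the PIG_CLASSPATH environment variable. The jar location and the script name are hypothetical; adjust them to your own setup.

```shell
# Hypothetical location of the jar; point this at wherever
# zebra-0.6.0-dev.jar actually lives on the machine that launches Pig.
ZEBRA_JAR="$HOME/lib/zebra-0.6.0-dev.jar"

# Append the jar to PIG_CLASSPATH, keeping any entries that are already set.
# The ${VAR:+...} form avoids a leading ':' when PIG_CLASSPATH is empty.
export PIG_CLASSPATH="${PIG_CLASSPATH:+$PIG_CLASSPATH:}$ZEBRA_JAR"

# Relaunch Pig so the new classpath takes effect.
# The script name below is illustrative only.
# pig store_calendar.pig
```

The `register` statement in the script can stay as it is; the point of this step is only that the Pig client itself can also load the Zebra classes when it sets up the job.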
-
Re: pig using zebra, ClassNotFoundException on TableOutputFormat
Bennie Schut 2009-11-16, 07:58
Ah thanks. Working like a charm now. Now I can play with the TableInserter part.