Pig + Hbase integration
Manu S 2012-10-25, 14:44
Hi,
I am using Pig-0.10.0 & hbase-0.94.2.
I am trying to store processed output in an HBase cluster using a Pig script.
I registered the required jars and set the MapReduce and ZooKeeper parameters within the script itself.
# cat input.pig
register jar/hbase-0.94.2.jar;
register jar/zookeeper-3.4.3.jar;
register jar/protobuf-java-2.4.0a.jar;
register jar/guava-11.0.2.jar;
register jar/pig-0.10.0.jar;

set fs.default.name hdfs://namenode:8020;
set mapred.job.tracker namenode:8021;
set hbase.cluster.distributed true;
set hbase.zookeeper.quorum namenode;
set hbase.master namenode:60000;
set hbase.zookeeper.property.clientPort 2181;

raw_data = LOAD 'sample_data.csv' USING PigStorage( ',' ) AS ( listing_id: chararray, fname: chararray, lname: chararray );

STORE raw_data INTO 'hbase://inputcsv' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage ('info:fname info:lname');
When I execute the script I am getting this error
# pig input.pig
2012-10-25 19:55:08,331 [main] INFO org.apache.pig.Main - Apache Pig version 0.10.0 (r1328203) compiled Apr 19 2012, 22:54:12
2012-10-25 19:55:08,332 [main] INFO org.apache.pig.Main - Logging error messages to: /export/home/hadoop/devel/pig/pig_1351175108325.log
2012-10-25 19:55:08,944 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sangamt4:8020
2012-10-25 19:55:09,172 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: sangamt4:8021
2012-10-25 19:55:10,021 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable
Details at logfile: /export/home/hadoop/devel/pig/pig_1351175108325.log

Appreciate your help on this.
Thanks, Manu S
Re: Pig + Hbase integration
Cheolsoo Park 2012-10-25, 16:49
Hi Manu,
Can you provide the output of 'cat /export/home/hadoop/devel/pig/pig_1351175108325.log' ?
Thanks, Cheolsoo
Re: Pig + Hbase integration
Manu S 2012-10-26, 03:57
Hi Cheolsoo,
Please find the log below.
On Thu, Oct 25, 2012 at 10:19 PM, Cheolsoo Park <[EMAIL PROTECTED]> wrote:
> Hi Manu,
>
> Can you provide the output of
> 'cat /export/home/hadoop/devel/pig/pig_1351175108325.log' ?

Pig Stack Trace
---------------
ERROR 2998: Unhandled internal error. org/apache/hadoop/hbase/filter/WritableByteArrayComparable

java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/filter/WritableByteArrayComparable
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:477)
        at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:507)
        at org.apache.pig.parser.LogicalPlanBuilder.validateFuncSpec(LogicalPlanBuilder.java:791)
        at org.apache.pig.parser.LogicalPlanBuilder.buildFuncSpec(LogicalPlanBuilder.java:780)
        at org.apache.pig.parser.LogicalPlanGenerator.func_clause(LogicalPlanGenerator.java:4583)
        at org.apache.pig.parser.LogicalPlanGenerator.store_clause(LogicalPlanGenerator.java:6225)
        at org.apache.pig.parser.LogicalPlanGenerator.op_clause(LogicalPlanGenerator.java:1335)
        at org.apache.pig.parser.LogicalPlanGenerator.general_statement(LogicalPlanGenerator.java:789)
        at org.apache.pig.parser.LogicalPlanGenerator.statement(LogicalPlanGenerator.java:507)
        at org.apache.pig.parser.LogicalPlanGenerator.query(LogicalPlanGenerator.java:382)
        at org.apache.pig.parser.QueryParserDriver.parse(QueryParserDriver.java:175)
        at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1589)
        at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1540)
        at org.apache.pig.PigServer.registerQuery(PigServer.java:540)
        at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:970)
        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:386)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:189)
        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
        at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
        at org.apache.pig.Main.run(Main.java:555)
        at org.apache.pig.Main.main(Main.java:111)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.filter.WritableByteArrayComparable
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        ... 28 more
===============================================================================
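For readers who hit a similar trace, the following is a minimal shell sketch (not part of the original thread) for pulling the missing class name out of a Pig log and turning it into the entry name to look for inside a jar. The helper names `missing_class_from_log` and `class_to_entry` are hypothetical, and the sketch assumes the log contains a "ClassNotFoundException: <class>" line like the one above.

```shell
# Hypothetical helpers, not part of Pig or HBase.

# Read a Pig log on stdin and print the first class named in a
# "ClassNotFoundException: <class>" line.
missing_class_from_log() {
  sed -n 's/.*ClassNotFoundException: *\([A-Za-z0-9_.$]*\).*/\1/p' | head -n 1
}

# Convert a dotted class name into the path it would have inside a jar.
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}
```

With these defined, something like `class_to_entry "$(missing_class_from_log < pig_1351175108325.log)"` gives an entry name that can be searched for in candidate jars, for example with `unzip -l some.jar | grep <entry>`.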
Re: Pig + Hbase integration
Cheolsoo Park 2012-10-26, 05:57
Hi Manu,
Thanks for providing the log.
1) ClassNotFoundException
Even though you're registering the jars in your script, they're not present in the classpath of the Pig client, so you're seeing that ClassNotFoundException. Can you try this?
PIG_CLASSPATH=<hbase_home>/hbase-0.94.2.jar:<hbase_home>/lib/zookeeper-3.4.3.jar:<hbase_home>/lib/protobuf-java-2.4.0a.jar ./bin/pig <your script>
The best way to use HBaseStorage is to install the HBase client locally, so that its jars are present in the classpath automatically. Then you don't have to add them to PIG_CLASSPATH.
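As a rough illustration of the PIG_CLASSPATH approach above, here is a small shell sketch that assembles the colon-separated value from an HBase install directory instead of typing version numbers by hand. The function name `build_pig_classpath` is hypothetical, and the directory layout (main jar at the top level, dependencies under lib/) is assumed from the paths shown above; adjust it if your install differs.

```shell
# Hypothetical helper: print a PIG_CLASSPATH value for the HBase install at $1.
# Assumed layout: <hbase_home>/hbase-*.jar and <hbase_home>/lib/*.jar.
build_pig_classpath() {
  hbase_home="$1"
  classpath=""
  for jar in "$hbase_home"/hbase-*.jar \
             "$hbase_home"/lib/zookeeper-*.jar \
             "$hbase_home"/lib/protobuf-java-*.jar; do
    # Skip the HBase test jar and any glob pattern that matched nothing.
    case "$jar" in *-tests.jar) continue ;; esac
    [ -f "$jar" ] || continue
    # Append with ':' only when the list is already non-empty.
    classpath="${classpath:+$classpath:}$jar"
  done
  printf '%s\n' "$classpath"
}
```

It could then be used as `PIG_CLASSPATH="$(build_pig_classpath <hbase_home>)" ./bin/pig <your script>`. If the `hbase` launcher script is installed, `hbase classpath` may be a simpler source for the same information, though that is a suggestion beyond what the thread tested.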
2) pig-0.10.0.jar
Can you also make sure that you use pig-0.10.0-withouthadoop.jar instead of pig-0.10.0.jar? pig-0.10.0.jar embeds hbase-0.90, so you will run into a compatibility issue if you run it against hbase-0.94.
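One way to check which Pig jar bundles HBase classes is to scan the jar listing. The sketch below is an illustration only: `has_embedded_hbase` is a hypothetical helper that reads a listing (for example from `unzip -l` or `jar tf`) on stdin and succeeds if any class sits under the org/apache/hadoop/hbase/ package.

```shell
# Hypothetical helper: exit 0 if the jar listing on stdin contains
# any class file under org/apache/hadoop/hbase/, non-zero otherwise.
has_embedded_hbase() {
  grep -q 'org/apache/hadoop/hbase/.*\.class'
}
```

For example, `unzip -l pig-0.10.0.jar | has_embedded_hbase && echo "embeds HBase classes"` should print the message for a jar that bundles HBase and stay silent for one that does not.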
Thanks, Cheolsoo
Re: Pig + Hbase integration
Manu S 2012-10-26, 06:44
Wow, it's fixed :)
Thanks a lot, Cheolsoo, for your quick solution.
Thanks, Manu S