Hive user mailing list


generating ORC file as output of a mapreduce job
Hi,
I am writing an MR job to generate data for Hive.

The code generates output in Text format without problems:

job.setOutputKeyClass(NullWritable.class);

job.setOutputValueClass(Text.class);
But when I change the value class from Text.class to OrcOutputFormat.class,
it throws an exception:
2013-11-20 00:50:50,613 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.VerifyError: class org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcRequestHeaderProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
at org.apache.hadoop.util.ProtoUtil.makeRpcRequestHeader(ProtoUtil.java:165)
at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:362)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1389)
at org.apache.hadoop.ipc.Client.call(Client.java:1318)
at org.apache.hadoop.ipc.Client.call(Client.java:1300)
at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
at sun.proxy.$Proxy6.getTask(Unknown Source)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:133)

My objective is to generate ORC files as the output of an MR job, so that I
can load the data into Hive directly. If another approach serves the same
objective, that would be fine too. Is there an HCatalog utility I can use
to do this?
Thanks a lot,

Johnny