Hive >> mail # user >> Hive - Issue Converting Text to Orc


Re: Hive - Issue Converting Text to Orc
Hi Bryan

My apologies for the delayed response. I am still on vacation and could not get much time to work on this issue. I was able to figure out the cause: the issue is an incompatibility between the protobuf version used to generate the code (OrcProto.java) and the runtime protobuf jar (protobuf-java-2.x.x.jar). This incompatibility is also discussed here: https://code.google.com/p/protobuf/issues/detail?id=493

To reproduce your exception, I tried the following:
1) Installed protoc version 2.4.1.
2) Compiled the Hive source and generated OrcProto.java (with protoc in PATH). I used the following command:
mvn package -DskipTests -Phadoop-2,protobuf -Pdist
     This command compiles the orcproto.proto file with protoc 2.4.1, while the Maven package pulls in protobuf-java-2.5.0.jar as a dependency.
3) Ran the Hive CLI and followed the steps you had mentioned in this mail thread.

Following the above steps resulted in the same exception you had posted.

The likely reason for the exception in your case: the hive-0.12.0 binary download uses protobuf 2.4.1 to compile the proto file, while hadoop-2.2.0 uses protobuf 2.5.0. When protobuf-java-2.5.0.jar is present in your classpath, the 2.4.1-generated code throws a runtime exception. To avoid this, try the solution below.
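As a quick sanity check, you can compare the protobuf versions on the two classpaths by their jar file names. A minimal sketch (the jar names below are examples; in practice, list hive/lib and hadoop-2.2.0/share/hadoop/common/lib to find the real ones):

```shell
#!/bin/sh
# Extract the protobuf version from a jar name, e.g.
# "protobuf-java-2.5.0.jar" -> "2.5.0".
jar_version() {
  basename "$1" .jar | sed 's/^protobuf-java-//'
}

# Example jar names standing in for the ones on your classpath.
hive_jar="protobuf-java-2.4.1.jar"    # jar shipped with the Hive binary
hadoop_jar="protobuf-java-2.5.0.jar"  # jar shipped with Hadoop 2.2.0

if [ "$(jar_version "$hive_jar")" != "$(jar_version "$hadoop_jar")" ]; then
  echo "protobuf version mismatch: $(jar_version "$hive_jar") vs $(jar_version "$hadoop_jar")"
fi
```

If the two versions differ, you have the mismatch described above.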

Solution:
1) Install protoc 2.5.0 and compile Hive 0.12.0 (ant protobuf && ant clean package)
2) Remove the protobuf-java-2.4.1.jar pulled by ivy from the build/dist/lib directory
3) Copy protobuf-java-2.5.0.jar from hadoop-2.2.0/share/hadoop/common/lib to hive/build/dist/lib
4) Run the Hive CLI and rerun your queries.
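Steps 2 and 3 can be sketched as below. The sketch runs against a throwaway directory layout so it is safe to try; substitute the real hive/build/dist/lib and hadoop-2.2.0/share/hadoop/common/lib paths on your machine:

```shell
#!/bin/sh
set -e
# Throwaway stand-ins for the real directories.
tmp=$(mktemp -d)
mkdir -p "$tmp/hive/build/dist/lib" "$tmp/hadoop/common/lib"
touch "$tmp/hive/build/dist/lib/protobuf-java-2.4.1.jar"  # jar pulled by ivy
touch "$tmp/hadoop/common/lib/protobuf-java-2.5.0.jar"    # jar shipped with Hadoop

# Step 2: remove the 2.4.1 jar from Hive's lib directory.
rm "$tmp/hive/build/dist/lib/protobuf-java-2.4.1.jar"

# Step 3: copy Hadoop's 2.5.0 jar into Hive's lib directory.
cp "$tmp/hadoop/common/lib/protobuf-java-2.5.0.jar" "$tmp/hive/build/dist/lib/"

# Hive's lib directory should now hold only the 2.5.0 jar.
ls "$tmp/hive/build/dist/lib"
```

After the real equivalents of these two commands, only protobuf-java-2.5.0.jar should remain on Hive's classpath.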

Let me know if this works.

Thanks
Prasanth Jayachandran

On Dec 31, 2013, at 2:54 AM, Bryan Jeffrey <[EMAIL PROTECTED]> wrote:

> Prasanth,
>
> Any luck?
>
>
> On Tue, Dec 24, 2013 at 4:31 PM, Bryan Jeffrey <[EMAIL PROTECTED]> wrote:
> Prasanth,
>
> I am also traveling this week.  Your assistance would be appreciated, but not at the expense of your holiday!
>
> Bryan
>
> On Dec 24, 2013 2:23 PM, "Prasanth Jayachandran" <[EMAIL PROTECTED]> wrote:
> Bryan
>
> I have a similar setup. I will try to reproduce this issue and get back to you asap.
> Since I am traveling, expect some delay.
>
> Thanks
> Prasanth
>
> Sent from my iPhone
>
> On Dec 24, 2013, at 11:39 AM, Bryan Jeffrey <[EMAIL PROTECTED]> wrote:
>
>> Hello.  
>>
>> I posted this a few weeks ago, but was unable to get a response that solved the issue, and I have made no headway in the meantime. I was hoping that if I re-summarized the issue, someone would have some advice regarding this problem.
>> Running the following version of Hadoop: hadoop-2.2.0
>> Running the following version of Hive: hive-0.12.0
>>
>> I have a simple test system setup with (2) datanodes/node manager and (1) namenode/resource manager.  Hive is running on the namenode, and contacting a MySQL database for metastore.
>>
>> I have created a small table 'from_text' as follows:
>>
>> [server:10001] hive> describe from_text;
>> foo                     int                     None
>> bar                     int                     None
>> boo                     string                  None
>>
>>
>> [server:10001] hive> select * from from_text;
>> 1       2       Hello
>> 2       3       World
>>
>> I go to insert the data into my ORC table, 'orc_test':
>>
>> [server:10001] hive> describe orc_test;
>> foo                     int                     from deserializer
>> bar                     int                     from deserializer
>> boo                     string                  from deserializer
>>
>>
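
[Editor's note: a hedged sketch of DDL that would produce tables shaped like the two describe outputs above. The column names and types follow the describe output; the tab delimiter on from_text is an assumption based on the select output, and STORED AS ORC is the standard Hive 0.12 syntax for an ORC-backed table.]

```sql
-- Sketch only: matches the describe output above.
CREATE TABLE from_text (foo INT, bar INT, boo STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';  -- delimiter assumed

CREATE TABLE orc_test (foo INT, bar INT, boo STRING)
  STORED AS ORC;
```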
>> The job runs, but fails to complete with the following errors (see below). This seems to be exactly the example covered here:
>>
>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>
>> The error output is below. I have tried several things to solve the issue, including re-installing Hive 0.12.0 from the binary distribution.
>>
>> Help?
>>
>> [server:10001] hive> insert into table orc_test select * from from_text;