Re: ORC with Map Column Type using Hive 0.11.0-RC1
Owen O'Malley 2013-05-03, 21:20
On Fri, May 3, 2013 at 10:20 AM, Andrew Psaltis <
[EMAIL PROTECTED]> wrote:
> I am trying to evaluate Hive 0.11.0-RC1; in particular, I am very
> interested in the ORC storage mechanism. We need one column in a table
> to be a Map<String,String>, and from what I have read this is
> supported with the ORC format. However, when trying to do a select on a
> table with a Map column I get an exception (the stack trace and more
> details are below). Here is what I have done to test this:
I've created https://issues.apache.org/jira/browse/HIVE-4494 for this issue.
> 1. I was under the impression, albeit perhaps wrong, that with ORC
> only the columns being selected would be deserialized. If that is true,
> then why would the map be deserialized when my query was for the column
> that is a string type and the map is not needed to satisfy the query?
ORC only reads and deserializes the columns that Hive requests, but the
rows returned are required to have all of the columns. The values for the
ignored columns will always be null. You are hitting a bug in setting up the
ObjectInspectors for reading the columns.
Ironically, doing select * from the table works fine.
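
To make the projection behavior above concrete, here is a minimal Python
sketch. It is not ORC's reader code; the function read_rows, its parameters,
and the toy table are all hypothetical and only model the idea that requested
columns are decoded while every returned row still has a slot for each column.

```python
from typing import Any, Dict, List, Optional, Sequence


def read_rows(
    columns: Dict[str, List[Any]],
    schema: Sequence[str],
    included: Sequence[str],
) -> List[List[Optional[Any]]]:
    """Return full-width rows, decoding only the columns named in `included`.

    Hypothetical model of column projection: `columns` stands in for
    column-oriented storage, `schema` gives the column order of a row.
    """
    num_rows = len(next(iter(columns.values()))) if columns else 0
    wanted = set(included)
    rows = []
    for i in range(num_rows):
        # Each row keeps one slot per schema column; skipped columns stay None.
        rows.append(
            [columns[name][i] if name in wanted else None for name in schema]
        )
    return rows


# Toy table: a string column and a map column, stored column by column.
table = {
    "name": ["a", "b"],
    "attrs": [{"k": "v"}, {"x": "y"}],
}
schema = ["name", "attrs"]

# Only "name" is requested, so "attrs" is never touched and comes back as None.
projected = read_rows(table, schema, included=["name"])
print(projected)  # [['a', None], ['b', None]]
```

In this model the map column is never decoded when only the string column is
requested, which is why a failure on such a query points at the reader setup
rather than at the map data itself.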
> 2. Is there something I am doing wrong here? If not, what can I do to
> help track down the source of the problem? I have tried this test using a
> map<int,int> and get the same results. Also, I have been trying to run the
> ORC JUnit tests with Eclipse but have been having a dickens of a time getting
> that to work.
It looks at a first pass that I need to make OrcMapObjectInspector