There’s actually a separate input format for vectorized processing on RCFile. See https://issues.apache.org/jira/browse/HIVE-4483. Vectorized execution won’t run as fast on RCFile as it does on ORC, but there should still be a noticeable improvement.
In the future, I think it’s best to update the standard input formats so they can work either vectorized or row-at-a-time. That makes it easier to enable vectorization on existing tables. This is what was done for ORC.
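As a rough illustration of the dual-mode idea, here is a minimal sketch of a reader that serves the same column either one row at a time or in batches. This is not Hive's actual input format API: the class `DualModeColumnReader` and its methods `nextRow` and `nextBatch` are hypothetical names made up for this example, and the "table" is just an in-memory array of longs.

```java
// Hypothetical sketch only: these names are NOT part of Hive.
// One reader over the same data, usable row-at-a-time or in batches.
final class DualModeColumnReader {
    private final long[] column; // stand-in for a stored column
    private int position = 0;    // next unread index

    DualModeColumnReader(long[] column) {
        this.column = column.clone();
    }

    // Row-at-a-time mode: writes one value into out[0].
    // Returns false once the column is exhausted.
    boolean nextRow(long[] out) {
        if (position >= column.length) {
            return false;
        }
        out[0] = column[position++];
        return true;
    }

    // Vectorized mode: copies up to batch.length values into batch.
    // Returns the number of values copied (0 when exhausted).
    int nextBatch(long[] batch) {
        int count = Math.min(batch.length, column.length - position);
        System.arraycopy(column, position, batch, 0, count);
        position += count;
        return count;
    }

    // Consumer written against the row-at-a-time entry point.
    static long sumRowAtATime(long[] data) {
        DualModeColumnReader reader = new DualModeColumnReader(data);
        long[] row = new long[1];
        long total = 0;
        while (reader.nextRow(row)) {
            total += row[0];
        }
        return total;
    }

    // Consumer written against the batch entry point.
    static long sumVectorized(long[] data, int batchSize) {
        DualModeColumnReader reader = new DualModeColumnReader(data);
        long[] batch = new long[batchSize];
        long total = 0;
        int count;
        while ((count = reader.nextBatch(batch)) > 0) {
            for (int i = 0; i < count; i++) {
                total += batch[i];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long[] data = {1, 2, 3, 4, 5};
        System.out.println(sumRowAtATime(data));
        System.out.println(sumVectorized(data, 2));
    }
}
```

Both consumers read the same underlying data and produce the same result. The point of the sketch is that an existing table would not need to be rewritten or re-declared to benefit from the batch path.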
I’m not sure how thoroughly queries using the input format from HIVE-4483 were tested with RCFile. The testing was much less extensive than for vectorized queries on ORC.
From: Rajesh Balamohan [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, January 8, 2014 6:47 PM
To: [EMAIL PROTECTED]
Subject: Vectorized execution on RCFile
Vectorization with ORCFile provides amazing performance. Does vectorization work with RCFile as well?
According to the explain plan in Hive 0.13 (snapshot), vectorization is not used with RCFile. Any pointers would be appreciated.