My employer, m6d.com, has given the thumbs up to open source our
latest Hive tool, hive-protobuf. We created this because we work with
protobuf formats often and wanted to be able to directly log and query
these types without writing one-off User Defined Functions or Input
Formats.
Hive-protobuf is much like the new Avro support and the already
existing Thrift support. Here is how it works:
If you have a sequence file with a serialized protobuf in the key and
a serialized protobuf in the value, a table can be created that
describes the data to Hive. The table need only be configured with
the protobuf-generated class names for the key and value, and it turns
the nested classes into nested structs.
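
To give a feel for the nested-class-to-nested-struct mapping, here is a
rough sketch in Python. It is not the hive-protobuf code: the Person and
Address dataclasses are hypothetical stand-ins for protobuf-generated
classes, and the scalar type table is an illustrative subset chosen for
this example.

```python
from dataclasses import dataclass, fields, is_dataclass
import typing


# Hypothetical stand-ins for protobuf-generated classes.
@dataclass
class Address:
    city: str
    zip_code: int


@dataclass
class Person:
    name: str
    age: int
    address: Address          # nested message -> nested struct
    emails: typing.List[str]  # repeated field -> array


# Illustrative mapping from Python scalars to Hive type names.
HIVE_SCALARS = {
    str: "string",
    int: "bigint",
    float: "double",
    bool: "boolean",
    bytes: "binary",
}


def hive_type(tp):
    """Return a Hive type string for a scalar, list, or dataclass type."""
    if tp in HIVE_SCALARS:
        return HIVE_SCALARS[tp]
    if typing.get_origin(tp) is list:
        (elem,) = typing.get_args(tp)
        return f"array<{hive_type(elem)}>"
    if is_dataclass(tp):
        # Recurse into each field so nested classes become nested structs.
        cols = ",".join(f"{f.name}:{hive_type(f.type)}" for f in fields(tp))
        return f"struct<{cols}>"
    raise TypeError(f"no Hive mapping for {tp!r}")


print(hive_type(Person))
```

The point is only that the class definition already carries the whole
schema, so a table description can be derived from the class name alone.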
We will eventually migrate the project into core Hive, but we want to
let it incubate on GitHub for a time. (For example, there is no support
for union types at the moment, and there may be other kinks or tuning
to work out.) Please check out the project and send pull requests if
you have patches.