Ken Krugler 2013-03-01, 04:02
Trevni is alive and well. It's been included in the last two Avro
releases, 1.7.3 and 1.7.4. There were some good improvements to
Trevni in 1.7.4, so use that version if possible.
Patches that add Trevni support to Pig and Hive are available:
If you already use Avro, Trevni is easy to incorporate. For
MapReduce jobs, a program that previously produced Avro can write
Trevni output simply by changing its OutputFormat. Similarly, to
read Trevni input in MapReduce, simply change the InputFormat and
specify a subset schema (deleting fields you don't need, i.e.,
So it shouldn't be hard to use Trevni with Cascading. If you work on
this, please let us know how it goes.
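
To make the subset schema idea above concrete, here is a minimal sketch in Python. It only manipulates the JSON text of an Avro record schema and does not call any Trevni or Avro API. The helper name project_schema, the LogEntry record, and its field names are all hypothetical and chosen purely for illustration.

```python
import json

# Hypothetical full schema of the records a job writes (illustration only).
FULL_SCHEMA = """
{
  "type": "record",
  "name": "LogEntry",
  "fields": [
    {"name": "user",      "type": "string"},
    {"name": "url",       "type": "string"},
    {"name": "bytes",     "type": "long"},
    {"name": "timestamp", "type": "long"}
  ]
}
"""


def project_schema(schema_json, keep):
    """Return the record schema as JSON text with only the fields in `keep`.

    Field order follows the original schema. Raises KeyError if a
    requested field does not exist, so typos fail early.
    """
    schema = json.loads(schema_json)
    if schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")

    wanted = set(keep)
    present = {field["name"] for field in schema["fields"]}
    missing = wanted - present
    if missing:
        raise KeyError("unknown fields: " + ", ".join(sorted(missing)))

    # Delete every field the reader does not need.
    schema["fields"] = [f for f in schema["fields"] if f["name"] in wanted]
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    # Keep two of the four columns; the rest are dropped from the schema.
    print(project_schema(FULL_SCHEMA, ["user", "bytes"]))
```

The resulting JSON is what you would hand to the job as the input schema, on the assumption that a column-oriented reader given fewer fields needs to touch fewer columns on disk.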
On Thu, Feb 28, 2013 at 8:02 PM, Ken Krugler
<[EMAIL PROTECTED]> wrote:
> Hi all,
> Any input as to the status of Trevni?
> I'm researching column-oriented file formats that aren't tightly coupled to specific platforms - this precludes ORCFile, for example.
> CIF seemed interesting, but IBM hasn't released the code. And Trevni seems to be a reasonable open source implementation of what they describe.
> But I haven't heard much about Trevni recently, or whether anybody is using it for real work.
> I see it mentioned in conjunction with Impala, but it sounds like that's on the roadmap versus being available yet.
> For context, I'm looking into using a column store to speed up Cascading workflows.
> -- Ken
> Ken Krugler
> +1 530-210-6378
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr