Joseph Adler 2013-05-20, 16:38
Re: Review Request: PIG-3015 Rewrite of AvroStorage

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8104/
-----------------------------------------------------------

(Updated May 20, 2013, 4:39 p.m.)

Review request for pig and Cheolsoo Park.

Changes
-------

Addressed most of Jonathan Coveney's comments.

Description
-------

The current AvroStorage implementation has several problems: it requires old versions of Avro, it copies data far more than necessary, and it is verbose and complicated. (One pet peeve of mine is that old versions of Avro don't support Snappy compression.)

I rewrote AvroStorage from scratch to fix these issues. In early tests, the new implementation is significantly faster, and the code is a lot simpler. Rewriting AvroStorage also enabled me to implement support for Trevni.

This is the latest version of the patch, complete with test cases and TrevniStorage. (Test cases for TrevniStorage are still missing.)

This addresses bug PIG-3015.
    https://issues.apache.org/jira/browse/PIG-3015

Diffs (updated)
-----

  .eclipse.templates/.classpath a213e93
  build.xml aa6e09d
  ivy.xml 3a1cb2e
  ivy/libraries.properties 629feb4
  src/docs/src/documentation/content/xdocs/func.xml 9f8d740
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 5b54490
  src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POStore.java 249aecb
  src/org/apache/pig/builtin/AvroStorage.java PRE-CREATION
  src/org/apache/pig/builtin/TrevniStorage.java PRE-CREATION
  src/org/apache/pig/impl/util/avro/AvroArrayReader.java PRE-CREATION
  src/org/apache/pig/impl/util/avro/AvroBagWrapper.java PRE-CREATION
  src/org/apache/pig/impl/util/avro/AvroMapWrapper.java PRE-CREATION
  src/org/apache/pig/impl/util/avro/AvroRecordReader.java PRE-CREATION
  src/org/apache/pig/impl/util/avro/AvroRecordWriter.java PRE-CREATION
  src/org/apache/pig/impl/util/avro/AvroStorageDataConversionUtilities.java PRE-CREATION
  src/org/apache/pig/impl/util/avro/AvroStorageSchemaConversionUtilities.java PRE-CREATION
  src/org/apache/pig/impl/util/avro/AvroTupleWrapper.java PRE-CREATION
  test/commit-tests c6fbbca
  test/org/apache/pig/builtin/TestAvroStorage.java PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/directory_test.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/identity.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/identity_ai1_ao2.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/identity_ao2.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/identity_blank_first_args.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/identity_codec.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/identity_just_ao2.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/namesWithDoubleColons.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/projection_test.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/recursive_tests.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/trevni_to_avro.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/code/pig/trevni_to_trevni.pig PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/arrays.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/arraysAsOutputByPig.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/projectionTest.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordWithRepeatedSubRecords.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/records.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsAsOutputByPig.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsOfArrays.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsOfArraysOfRecords.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsSubSchema.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsSubSchemaNullable.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsWithDoubleUnderscores.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsWithEnums.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsWithFixed.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsWithMaps.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsWithMapsOfRecords.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recordsWithNullableUnions.json PRE-CREATION
  test/org/apache/pig/builtin/avro/data/json/recursiveRecord.json PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/arrays.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/arraysAsOutputByPig.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/projectionTest.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordWithRepeatedSubRecords.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/records.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsAsOutputByPig.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsOfArrays.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsOfArraysOfRecords.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsSubSchema.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsSubSchemaNullable.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsWithDoubleUnderscores.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsWithEnums.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsWithFixed.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsWithMaps.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsWithMapsOfRecords.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/schema/recordsWithNullableUnions.avsc PRE-CREATION
  test/org/apache/pig/builtin/avro/sch