Pig, mail # dev - Review Request: PIG-3059 Global configurable minimum 'bad record' thresholds


Review Request: PIG-3059 Global configurable minimum 'bad record' thresholds
Cheolsoo Park 2012-12-26, 06:40

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/8765/
-----------------------------------------------------------

Review request for pig, Santhosh Srinivasan, Jonathan Coveney, and Joseph Adler.

Description
-------

This patch implements configurable bad-record thresholds, building on Jonathan's work in PIG-2614.

The changes include:
- Adds two new Pig properties: pig.load.bad.record.threshold and pig.load.bad.record.min.
- Removes the 'ignore_bad_files' option from AvroStorage since it's no longer needed.
- Incorporates the InputErrorTracker class written by Jonathan in PIG-2614.
- Adds a try-catch block to the nextKeyValue() method in PigRecordReader.
- Adds new test cases to TestAvroStorage for these new properties.
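
To illustrate how the two properties and the try-catch in nextKeyValue() might fit together, here is a minimal sketch. It is not the code from the patch: the Tracker class, its method names, and the countGood() helper are hypothetical, and the semantics are assumptions (pig.load.bad.record.threshold is taken to be a maximum tolerated error rate, and pig.load.bad.record.min the number of errors that must occur before that rate is enforced). See the diff for the actual InputErrorTracker behavior.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class BadRecordSketch {

    /** Hypothetical stand-in for an error tracker; NOT the InputErrorTracker in the patch. */
    static final class Tracker {
        private final double maxErrorRate; // assumed meaning of pig.load.bad.record.threshold
        private final long minErrors;      // assumed meaning of pig.load.bad.record.min
        private long records;
        private long errors;

        Tracker(double maxErrorRate, long minErrors) {
            this.maxErrorRate = maxErrorRate;
            this.minErrors = minErrors;
        }

        /** Call once per record attempted, before trying to decode it. */
        void recordSeen() {
            records++;
        }

        /** Call when a record fails; rethrows once the error rate is too high. */
        void errorSeen(Exception cause) throws IOException {
            errors++;
            if (errors < minErrors) {
                return; // too few errors so far to judge the rate
            }
            double rate = (double) errors / records;
            if (rate > maxErrorRate) {
                throw new IOException("Error rate " + rate + " exceeds " + maxErrorRate, cause);
            }
        }
    }

    /** Mimics a reader loop: bad records are skipped until the tracker gives up. */
    static int countGood(List<String> raw, Tracker tracker) throws IOException {
        int good = 0;
        for (String s : raw) {
            tracker.recordSeen();
            try {
                Integer.parseInt(s); // stands in for decoding one record
                good++;
            } catch (NumberFormatException e) {
                tracker.errorSeen(e); // swallow the error unless the threshold is exceeded
            }
        }
        return good;
    }

    public static void main(String[] args) throws IOException {
        List<String> input = Arrays.asList("1", "2", "oops", "4");
        System.out.println(countGood(input, new Tracker(0.5, 2)));
    }
}
```

Under these assumptions, a single bad record in a small input is skipped because the minimum error count has not been reached, while a run of bad records fails the load as soon as both the minimum count and the rate limit are exceeded.
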

This addresses bug PIG-3059.
    https://issues.apache.org/jira/browse/PIG-3059

Diffs
-----

  conf/pig.properties 001a75e
  contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/AvroStorage.java 771c313
  contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroInputFormat.java 0a84915
  contrib/piggybank/java/src/main/java/org/apache/pig/piggybank/storage/avro/PigAvroRecordReader.java 9c37fec
  contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/TestAvroStorage.java 28a448f
  contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/expected_testCorruptedFile.avro 4670aae
  contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/expected_testCorruptedFile2.avro PRE-CREATION
  contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/expected_testCorruptedFile3.avro PRE-CREATION
  contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_corrupted_file.avro 78c1c12
  contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_corrupted_file/bad.avro PRE-CREATION
  contrib/piggybank/java/src/test/java/org/apache/pig/piggybank/test/storage/avro/avro_test_files/test_corrupted_file/good.avro PRE-CREATION
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/InputErrorTracker.java PRE-CREATION
  src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigRecordReader.java 6c77bad

Diff: https://reviews.apache.org/r/8765/diff/

Testing
-------

ant clean commit-test
ant clean compile-test jar-withouthadoop
cd contrib/piggybank/java
ant clean test -Dtestcase=TestAvroStorage

Thanks,

Cheolsoo Park
