

Re: Review Request: SQOOP-428: Support compression for Avro import


> On 2012-01-24 21:50:05, Tom White wrote:
> > Looks good. Have you run manual tests with it too?

Yes, I've used it and manually checked the output (only with Snappy, though), and the result is correct. That's when we stumbled upon SQOOP-429.

> On 2012-01-24 21:50:05, Tom White wrote:
> > src/test/com/cloudera/sqoop/TestAvroImport.java, line 89
> > <https://reviews.apache.org/r/3600/diff/1/?file=70555#file70555line89>
> >
> >     You should check that the files that are written are compressed (by looking at DataFileReader's metadata).
> >    
> >     We also need a test for --compress.

Thanks for the hint. I'll look into it and update the review.
- Lars
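
The check Tom suggests comes down to reading the `avro.codec` entry from the header of each written file. With Avro on the classpath, `DataFileReader` should expose this through `getMetaString("avro.codec")`. The sketch below is not code from the patch: it reads the same header entry using only the JDK, following the Avro object container file layout (magic bytes, then a metadata map of string keys to byte values). The class name `AvroHeaderProbe`, its method names, and the length cap are hypothetical and chosen only for illustration.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: reads the metadata map from an Avro container file header. */
class AvroHeaderProbe {

  // Avro container files start with the bytes 'O', 'b', 'j', 1.
  private static final byte[] MAGIC = {'O', 'b', 'j', 1};

  // Arbitrary sanity cap (an assumption) so a corrupt length cannot trigger a huge allocation.
  private static final long MAX_LEN = 1L << 24;

  /** Decodes one zigzag-encoded variable-length long, as used by Avro's binary encoding. */
  static long readLong(InputStream in) throws IOException {
    long raw = 0;
    int shift = 0;
    while (true) {
      int b = in.read();
      if (b < 0) {
        throw new EOFException("truncated varint");
      }
      raw |= (long) (b & 0x7F) << shift;
      if ((b & 0x80) == 0) {
        break;
      }
      shift += 7;
      if (shift > 63) {
        throw new IOException("varint too long");
      }
    }
    return (raw >>> 1) ^ -(raw & 1); // undo zigzag
  }

  /** Reads a length-prefixed byte sequence (Avro "bytes" and "string" share this layout). */
  static byte[] readBytes(DataInputStream in) throws IOException {
    long len = readLong(in);
    if (len < 0 || len > MAX_LEN) {
      throw new IOException("bad length: " + len);
    }
    byte[] buf = new byte[(int) len];
    in.readFully(buf);
    return buf;
  }

  /** Returns the header metadata, with values decoded as UTF-8 for convenience. */
  static Map<String, String> readMetadata(InputStream stream) throws IOException {
    DataInputStream in = new DataInputStream(stream);
    byte[] magic = new byte[MAGIC.length];
    in.readFully(magic);
    if (!Arrays.equals(magic, MAGIC)) {
      throw new IOException("not an Avro container file");
    }
    Map<String, String> meta = new HashMap<>();
    long count;
    // The map is written as blocks of entries; a block count of 0 ends the map.
    while ((count = readLong(in)) != 0) {
      if (count < 0) {
        count = -count;
        readLong(in); // a negative count is followed by the block size in bytes (unused here)
      }
      for (long i = 0; i < count; i++) {
        String key = new String(readBytes(in), StandardCharsets.UTF_8);
        String value = new String(readBytes(in), StandardCharsets.UTF_8);
        meta.put(key, value);
      }
    }
    return meta;
  }

  /** Returns the codec name; per the Avro spec a missing entry means "null" (uncompressed). */
  static String readCodec(InputStream stream) throws IOException {
    String codec = readMetadata(stream).get("avro.codec");
    return codec == null ? "null" : codec;
  }
}
```

A test along these lines would open each imported data file, pass the stream to `readCodec`, and compare the result with the codec the import was asked to use. A result of `"null"` would indicate that the file was written uncompressed.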
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3600/#review4566
-----------------------------------------------------------
On 2012-01-24 14:07:58, Lars Francke wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/3600/
> -----------------------------------------------------------
>
> (Updated 2012-01-24 14:07:58)
>
>
> Review request for Sqoop.
>
>
> Summary
> -------
>
> This essentially just ports the code from Avro's (1.5.4) AvroOutputFormat to the new MR API.
>
> I've changed the tests to extract the common functionality into a helper method, because they are identical apart from the two command-line arguments.
>
> I could have deleted AvroJob completely, but since I was told last time that binary compatibility needs to be maintained, I left it in. As far as I can tell it's no longer needed, because all necessary functionality is available from Avro's own version of that file. If it's okay to delete that redundant file (two actually, one each in the cloudera and apache packages), let me know and I'll provide a new patch.
>
>
> This addresses bug SQOOP-428.
>     https://issues.apache.org/jira/browse/SQOOP-428
>
>
> Diffs
> -----
>
>   src/java/org/apache/sqoop/mapreduce/AvroJob.java a57aaf1
>   src/java/org/apache/sqoop/mapreduce/AvroOutputFormat.java 96befd7
>   src/test/com/cloudera/sqoop/TestAvroImport.java 1b8b046
>
> Diff: https://reviews.apache.org/r/3600/diff
>
>
> Testing
> -------
>
> All tests pass for hadoopversion=20, but TestColumnTypes fails for me on 23. I can't see how that's related, though.
>
>
> Thanks,
>
> Lars
>
>