
Review Request: BigDecimals converted to String in inconsistent format

This is an automatically generated e-mail. To reply, visit:

Review request for Sqoop.

Currently, when BigDecimal fields are saved as Strings, Sqoop uses the toString method. This leads to values like "0.0000001" being stored as "1E-7", which doesn't seem ideal.
This patch changes Sqoop to use toPlainString for BigDecimals so they will always be stored in the same plain (non-scientific) format. This should have minimal effect, as the values can still be converted back to BigDecimals no matter which way they are stored. The scale doesn't seem relevant either: it seems to always be zero anyway, so there shouldn't be any change there.
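
For illustration, the difference between the two methods can be reproduced with plain java.math.BigDecimal. This is a minimal standalone sketch, not code from the patch; the class name BigDecimalFormatDemo is made up for the example.

```java
import java.math.BigDecimal;

// Illustration only; this class is hypothetical and not part of Sqoop.
final class BigDecimalFormatDemo {
    public static void main(String[] args) {
        BigDecimal value = new BigDecimal("0.0000001");

        // toString() switches to exponent notation for small values.
        String scientific = value.toString();
        // toPlainString() never uses an exponent.
        String plain = value.toPlainString();

        System.out.println(scientific); // prints 1E-7
        System.out.println(plain);      // prints 0.0000001

        // Either string parses back to an equal BigDecimal (same value and scale).
        boolean sameValue = new BigDecimal(scientific).equals(new BigDecimal(plain));
        System.out.println(sameValue);  // prints true
    }
}
```

Both strings round-trip to the same BigDecimal, which is why readers that parse the stored value should be unaffected by the change.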
I added a new parameter "sqoop.bigdecimal.format.string" which can be set to false to revert to the old behaviour.
I didn't add this as a command line parameter, as it seems like something most users would not change and I didn't want to confuse users with another option. They can still override it in sqoop-site.xml or on the command line using -D.
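
As a rough sketch of how such a switch might behave, the helper below chooses between the two methods based on the property. It is an assumption-laden illustration, not the patch itself: java.util.Properties stands in for the Hadoop configuration object, and the class and method names are hypothetical. Only the property name and its default (plain format unless set to false) come from this request.

```java
import java.math.BigDecimal;
import java.util.Properties;

// Hypothetical helper; the real patch reads the job configuration instead.
final class BigDecimalFormatSketch {
    // Property name taken from the review request.
    static final String FORMAT_KEY = "sqoop.bigdecimal.format.string";

    // Returns the plain form unless the property is explicitly set to false.
    static String format(BigDecimal value, Properties conf) {
        boolean usePlain = Boolean.parseBoolean(conf.getProperty(FORMAT_KEY, "true"));
        return usePlain ? value.toPlainString() : value.toString();
    }

    public static void main(String[] args) {
        BigDecimal value = new BigDecimal("0.0000001");
        Properties conf = new Properties();

        // Property not set: new behaviour.
        System.out.println(format(value, conf)); // prints 0.0000001

        // Property set to false: old behaviour.
        conf.setProperty(FORMAT_KEY, "false");
        System.out.println(format(value, conf)); // prints 1E-7
    }
}
```

Defaulting to the plain format when the property is absent matches the behaviour described in the testing notes of this request.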
This addresses bug SQOOP-830.

  src/java/org/apache/sqoop/hbase/HBasePutProcessor.java 64a1d18
  src/java/org/apache/sqoop/hbase/ToStringPutTransformer.java 1f52ba9
  src/java/org/apache/sqoop/mapreduce/AvroImportMapper.java 30db288
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java f6e2e72
  src/java/org/apache/sqoop/orm/ClassWriter.java 126b406

Diff: https://reviews.apache.org/r/9081/diff/

I have manually tested text file, Avro file and HBase imports using both values of the new parameter.
I also checked that toPlainString is used if the parameter is not set.
I tested sequence files as well, but there is no change as they don't use the toString methods.

David Robson