Sqoop, mail # dev - RE: [jira] [Created] (SQOOP-830) HBase import formatting BigDecimal inconsistently


RE: [jira] [Created] (SQOOP-830) HBase import formatting BigDecimal inconsistently
David Robson 2013-01-18, 03:24
It is always possible to add rows to HBase in any format you want by creating a custom put transformer - but that is quite difficult to do...

Do you think it's worth adding another property for this? Something like "bigdecimal.format.string"?

What I was thinking is that I should also change how BigDecimals are handled in codegen (so the change applies to text files too). Basically, wherever we call BigDecimal.toString(), we would call BigDecimal.toPlainString() instead if the new parameter is set to true, and I propose making it true by default.
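
To make the proposal concrete, here is a minimal sketch of the switch described above. The class name, the method name, and the `usePlainString` flag are hypothetical illustrations, not existing Sqoop code or an existing Sqoop property; only `BigDecimal.toString()` and `BigDecimal.toPlainString()` are real JDK calls.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of Sqoop: it only illustrates the proposed switch.
class BigDecimalFormatSketch {

    // usePlainString stands in for the proposed (not yet existing) parameter.
    static String format(BigDecimal value, boolean usePlainString) {
        if (value == null) {
            return null;
        }
        // toPlainString() never uses an exponent; toString() may.
        return usePlainString ? value.toPlainString() : value.toString();
    }

    public static void main(String[] args) {
        BigDecimal small = new BigDecimal("0.0000001");
        System.out.println(format(small, true));   // prints 0.0000001
        System.out.println(format(small, false));  // prints 1E-7
    }
}
```

With the flag defaulting to true, both HBase cells and text files would get the plain form, and anyone depending on the old output could turn it off.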

What do you think?

David Robson
Software Developer
Dell | R&D, Quest Software
office +61 3 9811 8082

Quest Software is now part of Dell
-----Original Message-----
From: Jarek Jarcec Cecho [mailto:[EMAIL PROTECTED]]
Sent: Thursday, 17 January 2013 5:03 PM
To: [EMAIL PROTECTED]
Subject: Re: [jira] [Created] (SQOOP-830) HBase import formatting BigDecimal inconsistently

Hi Rob,
thank you for taking a look at this issue!

I don't think that this will break anything, as the number will still be a number, just in a different format. But just in case, what about making this adjustable from the command line?

Jarcec

On Wed, Jan 16, 2013 at 11:46:59PM +0000, David Robson wrote:
> Hi,
>
> I was thinking of changing this to use "toPlainString()" instead of "toString()" on BigDecimals, as it seems confusing to have some numbers stored as normal decimals and others stored in scientific notation.
>
> Does anyone have any input on this? Do you think this would break anyone's processes? It seems like either one can be converted to a BigDecimal using the constructor, so I don't think it will break anything. Is anyone storing BigDecimals in HBase who has any input?
>
> Thanks,
>
> David Robson
> Software Developer
> Dell | R&D, Quest Software
> office +61 3 9811 8082
>
> Quest Software is now part of Dell
>
>
> -----Original Message-----
> From: David Robson (JIRA) [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, 16 January 2013 10:12 AM
> To: [EMAIL PROTECTED]
> Subject: [jira] [Created] (SQOOP-830) HBase import formatting BigDecimal inconsistently
>
> David Robson created SQOOP-830:
> ----------------------------------
>
>              Summary: HBase import formatting BigDecimal inconsistently
>                  Key: SQOOP-830
>                  URL: https://issues.apache.org/jira/browse/SQOOP-830
>              Project: Sqoop
>           Issue Type: Bug
>             Reporter: David Robson
>
>
> When importing into HBase the toString() method is called on every field via the ToStringPutTransformer class.
> When the field is mapped as a BigDecimal - as it is with number fields in Oracle - this results in inconsistent formats in HBase.
> For example - create the following in Oracle:
>
> CREATE TABLE employee(id number primary key, test_number number);
> INSERT INTO employee values(1, 0.000001);
> INSERT INTO employee values(2, 0.0000001);
> COMMIT;
>
> Then run an import:
>
> sqoop import --connect jdbc:oracle:thin:@//HOSTNAME/SERVICE --username USERNAME --table EMPLOYEE --password PASSWORD --hbase-table EMPLOYEE --column-family tst --hbase-create-table
>
> The value for row 1 is "0.000001" while row 2 is "1E-7".
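
The difference between the two rows can be reproduced with the JDK alone. The sketch below assumes the imported values arrive as BigDecimals equal to the two literals in the INSERT statements; the class and method names are made up for illustration. It also checks the point about the constructor: both string forms parse back to the same numeric value.

```java
import java.math.BigDecimal;

// Illustrative only: the class and method names are not from Sqoop.
class BigDecimalNotationDemo {

    // True when both strings parse to numerically equal BigDecimals.
    static boolean sameValue(String a, String b) {
        return new BigDecimal(a).compareTo(new BigDecimal(b)) == 0;
    }

    public static void main(String[] args) {
        for (String literal : new String[] {"0.000001", "0.0000001"}) {
            BigDecimal v = new BigDecimal(literal);
            // toString() switches to scientific notation once the adjusted
            // exponent drops below -6; toPlainString() never does.
            System.out.println(v.toString() + " | " + v.toPlainString());
        }
        // prints:
        // 0.000001 | 0.000001
        // 1E-7 | 0.0000001

        System.out.println(sameValue("1E-7", "0.0000001")); // prints true
    }
}
```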
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators.
> For more information on JIRA, see: http://www.atlassian.com/software/jira