Home | About | Sematext search-lucene.com search-hadoop.com
 Search Hadoop and all its subprojects:

Switch to Threaded View
Sqoop >> mail # dev >> Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

Copy link to this message
Re: Review Request 12089: SQOOP-1101 Sqoop2: Basic schema: Provide implementation of the proposed data types

> On June 25, 2013, 11:27 p.m., Abraham Elmahrek wrote:
> > Overall, looks good. I am mostly unsure about the use of 31 as the only prime number separating hash code values.
> Jarek Cecho wrote:
>     Hi Abe,
>     thank you very much for the review, appreciated! All the hashCode methods were generated by my IDE (IntelliJ IDEA) including the magical constant 31. I assume that this is best practice, however please correct me if I'm wrong.
> Abraham Elmahrek wrote:
>     The number 31 seems arbitrary. My understanding is that hash codes with multiple instance members involved require some level of separation. By consistently using the number 31, there is a small chance that there will be some overlap between the hash codes produced by the members. It should increase the chance of a collision... the amount it increases it by is likely small though if the members are strings. If the members are Boolean, the overlap is guaranteed... IE: class A: member1 = Boolean,  member1 = Boolean, hashCode = func { result = 0; if member1 != null, then result = member1.hashCode(); if member2 != null, then result = 31*result + member2.hashCode(); }. NOTE: Boolean in java has two values for its hash code.
> Venkat Ranganathan wrote:
>     I think the basic idea is to have an hash code that is somewhat similar to member1*31^(n-1) + member2*31^(n-2) + ... + member[n] using int arithmetic (where member1 is the hashcode of member object 1 for example computed similarly).  But it is done with integer arithmetic.   I have seen other numbers used, like say 37, but may be that has better dispersion but still will be an arbitrary number, right?

That seems right. It's just the implementation is a bit skeptical given null values in members effectively zero out an iteration providing more opportunity for collisions. IE the scenario I've stated above.

Anyways, I don't think it matters too much. If you guys are fine with it. I am too.
- Abraham
This is an automatically generated e-mail. To reply, visit:
On June 25, 2013, 9:42 p.m., Jarek Cecho wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12089/
> -----------------------------------------------------------
> (Updated June 25, 2013, 9:42 p.m.)
> Review request for Sqoop.
> Bugs: SQOOP-1101
>     https://issues.apache.org/jira/browse/SQOOP-1101
> Repository: sqoop-sqoop2
> Description
> -------
> This part of SQOOP-1073 is adding classes representing the data types proposed in SQOOP-515.
> Diffs
> -----
>   common/pom.xml 2921800
>   common/src/main/java/org/apache/sqoop/schema/Schema.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/SchemaError.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractComplexType.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractDateTime.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractNumber.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/AbstractString.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/Array.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/Binary.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/Bit.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/Column.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/Date.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/DateTime.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/Decimal.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/Enum.java PRE-CREATION
>   common/src/main/java/org/apache/sqoop/schema/type/FixedPoint.java PRE-CREATION