Hive >> mail # dev >> HIVE-4053 | Review request


Krishna 2013-02-23, 05:48
Re: HIVE-4053 | Review request
Krishna,
Can you please post a patch on the JIRA and post a review on
reviewboard? You should also consider adding some unit tests. If you
need help with any of this, please let us know.

I will post this on JIRA as well for completeness.

Mark

On Fri, Feb 22, 2013 at 9:48 PM, Krishna <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I've implemented the 'Refined Soundex' algorithm as a GenericUDF and would
> like to share it for review by experts, as I'm a newbie.
>
> Change Details:
> A new Java class is created: GenericUDFRefinedSoundex.java
> An entry is added to FunctionRegistry.java: registerGenericUDF("soundex_ref",
> GenericUDFRefinedSoundex.class);
>
> Both files are attached to the email.
>
> I'm planning to implement other phonetic algorithms and submit them all as a
> single patch. I understand there are many other steps that I need to finish
> before a patch is ready, but for now, if you could review the attached code
> and provide feedback, that would be great.
>
> Here are the details of the Refined Soundex algorithm:
> The first letter is stored.
> Letters are then replaced by digits as defined below (as the example shows,
> the first letter is also encoded):
>  * B, P => 1
>  * F, V => 2
>  * C, K, S => 3
>  * G, J => 4
>  * Q, X, Z => 5
>  * D, T => 6
>  * L => 7
>  * M, N => 8
>  * R => 9
>  * Other letters => 0
> Consecutive letters belonging to the same group are replaced by a single digit.
>
> Example:
>> SELECT soundex_ref('Carren') FROM src LIMIT 1;
>> C30908
>
> Thanks,
> Krishna
>
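
For readers following the thread, the Refined Soundex steps quoted above can be sketched in plain Java. This is a minimal standalone illustration, not the attached GenericUDFRefinedSoundex.java (which is not shown in this thread). The class name RefinedSoundexSketch and the method encode are hypothetical. Two points are assumptions: non-letter characters are skipped without breaking a run, and the first letter is both stored and encoded, which is what the quoted result C30908 for 'Carren' implies.

```java
// Hypothetical standalone sketch of the Refined Soundex steps quoted above.
// It is not the GenericUDF from the attachment and has no Hive dependencies.
class RefinedSoundexSketch {

    // Digit for each letter A..Z, transcribed from the mapping in the message:
    // B,P=1  F,V=2  C,K,S=3  G,J=4  Q,X,Z=5  D,T=6  L=7  M,N=8  R=9  others=0
    private static final String CODES = "01360240043788015936020505";

    static String encode(String input) {
        if (input == null) {
            return null;
        }
        StringBuilder out = new StringBuilder();
        char last = 0; // digit emitted for the previous letter, 0 if none yet
        for (char raw : input.toCharArray()) {
            char c = Character.toUpperCase(raw);
            if (c < 'A' || c > 'Z') {
                continue; // assumption: characters outside A-Z are skipped
            }
            if (out.length() == 0) {
                out.append(c); // the first letter is stored as-is
            }
            char code = CODES.charAt(c - 'A');
            if (code != last) {
                out.append(code); // a run of same-group letters yields one digit
            }
            last = code;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("Carren")); // the example from the message
    }
}
```

Calling RefinedSoundexSketch.encode("Carren") walks C, A, R, R, E, N as 3, 0, 9, (9 dropped), 0, 8 after the stored C. A check like this could also serve as a starting point for the unit tests requested above.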