Hive >> mail # dev >> Review Request: Add support for pulling HBase columns with prefixes


Swarnim Kulkarni 2013-02-09, 15:21
Re: Review Request: Add support for pulling HBase columns with prefixes


> On Feb. 5, 2013, 3:43 a.m., Mark Grover wrote:
> > hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java, line 192
> > <https://reviews.apache.org/r/9276/diff/1/?file=254957#file254957line192>
> >
> >     This seems like a limited case of pattern matching. Swarnim, any way we can support generic regex matching instead?
>
> Swarnim Kulkarni wrote:
>     Mark, in this case I specifically wanted to allow only strings that end with exactly the character "*", and using String#endsWith seemed simpler and more readable than a regex. Do you still want me to replace this with regex matching?
>
> Brock Noland wrote:
>     I think the issue is that this would make it difficult to implement enhanced pattern matching later. Implementing it now, you'd only need to specify:
>    
>     col.*
>    
>     in the table configuration. The issue then would be detecting whether a particular column was a regex pattern. Because #, comma, and : are used as separators, those characters could not be used in a pattern.
>
> Swarnim Kulkarni wrote:
>     Thanks Brock. Makes sense. To be sure I am understanding you right, the change now would be just to replace the "parts[1].endsWith(*)" with something more regexy that would still imply that the string ends with "*". Correct?
>
> Mark Grover wrote:
>     I think that should do it.
>    
>     Personally, I think having limited regex matching is just going to confuse people, so if you could implement (and test) full Java-style regex matching (like we do for RegexSerDe, for example), that would be fantastic. Of course, let me know if you have questions!
>    
>     Thanks for doing this, BTW!
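
For readers following the exchange above, here is a minimal sketch of the two matching styles being compared. It is not the code from the patch: the class name QualifierMatch and both method names are hypothetical, and the only library call used is java.util.regex.Pattern.matches.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not the code under review.
final class QualifierMatch {
    // Prefix style: a spec ending in "*" matches any qualifier that starts
    // with the text before the "*"; any other spec must match exactly.
    static boolean matchesPrefix(String spec, String qualifier) {
        if (!spec.endsWith("*")) {
            return spec.equals(qualifier);
        }
        String prefix = spec.substring(0, spec.length() - 1);
        return qualifier.startsWith(prefix);
    }

    // Regex style: the spec is a full Java regular expression that must
    // match the whole qualifier, e.g. "col.*".
    static boolean matchesRegex(String spec, String qualifier) {
        return Pattern.matches(spec, qualifier);
    }

    public static void main(String[] args) {
        System.out.println(matchesPrefix("col*", "col1"));  // true
        System.out.println(matchesRegex("col.*", "col1"));  // true
        // As a regex, "col*" means "co" followed by zero or more 'l',
        // so it does not match "col1".
        System.out.println(matchesRegex("col*", "col1"));   // false
    }
}
```

The last call shows why mixing the two styles could confuse users: the same spec "col*" selects different columns depending on whether it is read as a prefix wildcard or as a regex.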

Thanks for the suggestions. I incorporated them and updated the review. If you get a chance, please let me know if they look any better.
- Swarnim
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/9276/#review16080
-----------------------------------------------------------
On Feb. 9, 2013, 9:56 p.m., Swarnim Kulkarni wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/9276/
> -----------------------------------------------------------
>
> (Updated Feb. 9, 2013, 9:56 p.m.)
>
>
> Review request for hive.
>
>
> Description
> -------
>
> Added support for pulling HBase columns just by providing a prefix and a wildcard. A table definition could now look something like this:
>
> CREATE EXTERNAL TABLE hive_hbase_test
> ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe'
> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,fam1:col*")
> TBLPROPERTIES ("hbase.table.name" = "TEST_HBASE_TABLE");
>
> This would pull in all columns under column family "fam1" whose names start with "col". This gives a little more flexibility than the existing pull-all-columns format.
>
>
> This addresses bug HIVE-3725.
>     https://issues.apache.org/jira/browse/HIVE-3725
>
>
> Diffs
> -----
>
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/HBaseSerDe.java 7f37ba5
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseCellMap.java a8ba9d9
>   hbase-handler/src/java/org/apache/hadoop/hive/hbase/LazyHBaseRow.java d35bb52
>   hbase-handler/src/test/org/apache/hadoop/hive/hbase/TestHBaseSerDe.java e821282
>
> Diff: https://reviews.apache.org/r/9276/diff/
>
>
> Testing
> -------
>
> Added unit tests to demonstrate the new functionality. Also made sure that all existing unit tests passed.
>
>
> Thanks,
>
> Swarnim Kulkarni
>
>
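
As a rough illustration of the behaviour described in the review request, the sketch below splits a mapping entry such as "fam1:col*" into family and qualifier spec and keeps only the matching cells. Everything here is an assumption made for illustration: the class PrefixMappingSketch, the method select, and the use of a plain String map in place of HBase cell data are hypothetical, and the treatment of an empty qualifier as "all columns in the family" is assumed from the pull-all-columns format mentioned in the description. The actual logic lives in the files listed under Diffs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the implementation in HBaseSerDe or LazyHBaseCellMap.
final class PrefixMappingSketch {
    // mappingEntry: one entry of hbase.columns.mapping, e.g. "fam1:col*".
    // familyCells: qualifier -> value for the cells of that column family.
    static Map<String, String> select(String mappingEntry, Map<String, String> familyCells) {
        // Split into family and qualifier spec; limit 2 keeps a trailing empty spec.
        String[] parts = mappingEntry.split(":", 2);
        String spec = parts.length > 1 ? parts[1] : "";

        Map<String, String> selected = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : familyCells.entrySet()) {
            String qualifier = cell.getKey();
            boolean keep;
            if (spec.isEmpty()) {
                keep = true;  // assumed: "fam1:" selects the whole family
            } else if (spec.endsWith("*")) {
                keep = qualifier.startsWith(spec.substring(0, spec.length() - 1));
            } else {
                keep = qualifier.equals(spec);  // single named column
            }
            if (keep) {
                selected.put(qualifier, cell.getValue());
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        Map<String, String> cells = new LinkedHashMap<>();
        cells.put("col1", "a");
        cells.put("col2", "b");
        cells.put("other", "c");
        System.out.println(select("fam1:col*", cells).keySet());  // [col1, col2]
    }
}
```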
