John Omernik 2012-04-04, 13:40
I did a test today:
select * from table where 'hello' like '%el%' limit 10;
I got 10 rows from my table
select * from table where 'HELLO' like '%el%' limit 10;
I got 0 rows from the same table.
Based on this, it would appear the like statement is case sensitive.
Coming from other RDBMs, this would be counter to the common knowledge.
(I've included a piece from the mysql documentation below). While I know
hive ISN'T a RDBMS, it is trying to ease the transition into using bigdata
for people who know SQL. This is a commonly used aspect of RDBMS and for
someone moving to hive, they may write this statement, get no results and
assume there are no results that match. This is not clear in the hive-wiki
documentation, nor is it something that a reasonable user should be
"expected" to know or even test prior to using hive. I think the like
statement should be changed to be case-insensitive to match it's function
in other DBMS Thoughts?
Mysql Documentation Reference
The following two statements illustrate that string comparisons are not
case sensitive unless one of the operands is a binary string:
mysql> *SELECT 'abc' LIKE 'ABC';*
mysql> *SELECT 'abc' LIKE BINARY 'ABC';*