Thejas Nair 2013-07-03, 01:28
Edward Capriolo 2013-07-03, 01:52
Thejas Nair 2013-07-03, 02:19
Edward Capriolo 2013-07-03, 03:39
Thejas Nair 2013-07-03, 07:26
Edward Capriolo 2013-07-03, 19:25
Thejas Nair 2013-07-03, 19:39
I'm +1 for calling it Hive SQL. No one knows what HQL is when they see the initials. Hive Query Language? Hadoop Query Language? Harold's Query Language? I agree with Ed that we should be up front about what Hive is and isn't, and about where it's going and where it isn't. Whenever people ask me whether being fully SQL-92 or SQL-2003 compliant (or whatever) is a goal, I always say no. There's stuff in those specs Hive probably will never do. But to me that doesn't mean it isn't SQL.
Apache Derby calls its access language SQL. Yet it doesn't support outer joins or tiny int or a number of other things Hive does. SQLite calls its access language SQL and it has similar restrictions.
People understand that every data store has a different dialect of SQL. Hive's dialect is particularly crude in some respects (lacking some standard features and datatypes), and doing anything real requires concepts not known in other SQL dialects (like which SerDe you want your table to use). Some of these we can address and some are a part of being on Hadoop.
One final analogy. When a child is learning a language, they 1) don't know as many words as an adult does and often don't understand adult usage even when they know all the words; and 2) use made-up or nonsense words. Yet no one says the child doesn't speak the language or speaks a different language. You just recognize that the child is growing and learning the language. How is Hive different? It is growing and adding more parts of SQL all the time.
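For reference, the Book query quoted further down uses a scalar subquery in the WHERE clause. One common way to express the same thing in a dialect that lacks WHERE-clause subqueries is to join against a one-row aggregate in the FROM clause. The following is only a minimal sketch, using Python's built-in sqlite3 as a stand-in engine rather than Hive; the sample rows and the aliases `b`, `a`, and `avg_price` are made up for illustration, and it does not verify whether any particular Hive release accepts the rewritten form.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is NOT Hive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Book (isbn TEXT, title TEXT, price REAL)")

# Hypothetical sample rows, invented for this sketch.
conn.executemany(
    "INSERT INTO Book VALUES (?, ?, ?)",
    [
        ("111", "C Title", 10.0),
        ("222", "A Title", 20.0),
        ("333", "B Title", 60.0),
    ],
)

# The query from the thread: scalar subquery in the WHERE clause.
scalar_subquery = """
SELECT isbn, title, price
FROM Book
WHERE price < (SELECT AVG(price) FROM Book)
ORDER BY title
"""

# Rewrite: compute the average once as a one-row derived table,
# cross join it to every Book row, then filter on the joined column.
join_rewrite = """
SELECT b.isbn, b.title, b.price
FROM Book b
CROSS JOIN (SELECT AVG(price) AS avg_price FROM Book) a
WHERE b.price < a.avg_price
ORDER BY b.title
"""

rows_subquery = conn.execute(scalar_subquery).fetchall()
rows_join = conn.execute(join_rewrite).fetchall()

# Both forms should return the same rows in the same order.
print(rows_subquery == rows_join)
```

Because the derived table has exactly one row, the cross join does not multiply the number of Book rows; it only attaches the average to each of them.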
On Jul 3, 2013, at 12:26 AM, Thejas Nair wrote:
> On Tue, Jul 2, 2013 at 8:39 PM, Edward Capriolo <[EMAIL PROTECTED]> wrote:
>> What is in a name? :)
>> "Which SQL feature you are talking about here, that forces single reducer
>> and hence should not be supported?"
>> Joining on anything besides = comes to mind
>> Pretty sure the query mentioned here will not work (without being
>> SELECT isbn, title, price
>> FROM Book
>> WHERE price < (SELECT AVG(price) FROM Book)
>> ORDER BY title;
> Don't you think hive should be supporting this? Don't you think our
> users would want this?
> You can do theta joins without using a single reducer (cartesian product
> can be done in parallel). But that is beside the point. I don't
> expect hive to be 100% sql compliant. I don't see 100% sql compliance
> as a goal, but I see more SQL compliance as desirable.
> That is why I prefer the term Hive-SQL.
>> Hive-SQL looks like it is trying to convey the idea that hive supports
>> extensions like T-SQL http://en.wikipedia.org/wiki/Transact-SQL or PL/SQL.
> If I refer to something as Oracle-SQL or DB2-SQL, I think people
> understand that it is an Oracle or DB2 dialect of SQL that I refer to.
>> Lessons from my mother.
>> You can't be half a saint.
>> "considering how much other databases deviate from the standard -
>> http://troels.arvin.dk/db/rdbms/ . See how much deviation is there for
>> example in 'limit clause' or the data types supported (and details of
>> data type support) -"
>> If all your friends jumped off a bridge would you do it?
> My friends are very smart, if they jump off the bridge, there is
> probably a very good reason to do so, and I would seriously consider
> I think hive has many smart friends like DB2, Oracle, teradata,
> vertica, impala, and even phoenix
> As you can see there is a wide range in SQL compliance across
> products. I don't see anything wrong in saying that hive is "SQL on
> I think I have conveyed everything I wanted to say on this topic. I
> will stop and listen to what others think before we go from half
> saints and jumping over the bridge to Hitler :)
> (http://en.wikipedia.org/wiki/Godwin's_law) (there I said it!!)
> I am looking forward to hearing if anybody else thinks calling it