Hive >> mail # user >> Hive performance vs. SQL?


Hive performance vs. SQL?
I haven't had an opportunity to set up a huge Hive database yet because exporting CSV files from our SQL database is, in itself, a rather laborious task.  I was just curious how I might expect Hive to perform vs. SQL on large databases and large queries.  I realize Hive is pretty high-latency, since it builds and runs MapReduce jobs for even the simplest queries, but that is precisely why I think it might perform better on long queries against large (external CSV) databases.

Would you expect Hive to ever outperform SQL on a single machine (standalone or pseudo-distributed mode)?  I am entirely open to the possibility that the answer is no, that Hive could never compete with SQL on a single machine.  Is this true?

If so, how large (how parallel) do you think the underlying Hadoop cluster needs to be before Hive overtakes SQL?  2X?  10X?  Where is the crossover point where Hive actually outperforms SQL?
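
One rough way to frame the crossover question is a toy cost model: a single-machine SQL scan costs data size divided by scan rate, while a Hive query pays a fixed MapReduce job startup overhead plus the same scan spread across the nodes. The sketch below is only that framing in Python. The function names, the linear-scaling assumption, and every number passed in are hypothetical placeholders, not measurements of Hive or of any SQL engine.

```python
def single_machine_seconds(data_gb, scan_gb_per_s):
    """Scan time on one SQL server: no startup cost, one machine's throughput."""
    return data_gb / scan_gb_per_s


def cluster_seconds(data_gb, nodes, node_gb_per_s, job_overhead_s):
    """Scan time on a cluster: fixed job overhead plus a perfectly parallel scan.

    Perfect linear scaling is an optimistic assumption; real jobs also pay
    for shuffle, stragglers, and scheduling.
    """
    return job_overhead_s + data_gb / (nodes * node_gb_per_s)


def crossover_gb(nodes, single_gb_per_s, node_gb_per_s, job_overhead_s):
    """Data size (GB) above which the cluster wins, or None if it never does."""
    cluster_rate = nodes * node_gb_per_s
    if cluster_rate <= single_gb_per_s:
        # The cluster is no faster at scanning, so the overhead is never repaid.
        return None
    # Solve: overhead + D / cluster_rate == D / single_rate, for D.
    return job_overhead_s / (1.0 / single_gb_per_s - 1.0 / cluster_rate)


if __name__ == "__main__":
    # Placeholder inputs, chosen only to exercise the formula.
    print(crossover_gb(nodes=10, single_gb_per_s=1.0,
                       node_gb_per_s=0.5, job_overhead_s=30.0))
    print(crossover_gb(nodes=1, single_gb_per_s=1.0,
                       node_gb_per_s=0.5, job_overhead_s=30.0))
```

Under this model a single node (standalone or pseudo-distributed) can only win if its per-node scan rate beats the SQL server's, and the crossover size grows with the job overhead and shrinks as nodes are added. Whether real workloads follow it is exactly the open question.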

Along similar lines, might Hive never outperform SQL on a database small enough for SQL to run on a single machine, say tens to hundreds of GB?  Must the database itself be so large that SQL is effectively crippled and the data must be distributed before Hive offers significant gains?

I am really just trying to get a basic feel for how I might anticipate Hive's behavior vs. SQL once I get a large system up and running.

Thanks.

________________________________________________________________________________
Keith Wiley     [EMAIL PROTECTED]     keithwiley.com    music.keithwiley.com

"I used to be with it, but then they changed what it was.  Now, what I'm with
isn't it, and what's it seems weird and scary to me."
                                           --  Abe (Grandpa) Simpson
________________________________________________________________________________