Accumulo >> mail # dev >> peformance


Drew Pierce 2013-05-03, 16:39
Josh Elser 2013-05-03, 18:41
Re: peformance
Hey Drew,

This could be a very broad question, so I'll give a partial answer and
encourage you to come back for more details.

Impala is a mechanism that sits on top of HBase or HDFS that is designed to
filter and process large quantities of data. People generally like Impala
because it supports a subset of SQL and because it is optimized to reduce
the latency that might be incurred by starting up a job in a bulk
synchronous processing framework. Instead, it uses a series of daemon
processes and a custom API to reduce overhead.

With Accumulo, our approach to low-latency queries is generally to use a
table structure that incorporates some type of index. With appropriate
indexing techniques, Accumulo can achieve sub-second query latencies even
over multi-petabyte corpora. Some of these table designs are
described in the manual:
http://accumulo.apache.org/1.4/user_manual/Table_Design.html
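
To make the index idea concrete, here is a minimal sketch in Python. It
does not use the Accumulo API: SortedTable is a hypothetical in-memory
stand-in for a sorted key-value table, and ingest and lookup are
illustrative helper names. The assumption is a simple term index, where
each indexed value becomes a row in a second table and the column points
back to the record, so a query seeks to one index row instead of
scanning the data table.

```python
import bisect


class SortedTable:
    """Hypothetical in-memory stand-in for a sorted key-value table."""

    def __init__(self):
        self._keys = []    # sorted list of (row, column) tuples
        self._values = {}  # (row, column) -> value

    def put(self, row, column, value=""):
        key = (row, column)
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_row(self, row):
        """Seek to `row` in sorted order and yield its (column, value) pairs."""
        pos = bisect.bisect_left(self._keys, (row, ""))
        while pos < len(self._keys) and self._keys[pos][0] == row:
            key = self._keys[pos]
            yield key[1], self._values[key]
            pos += 1


def ingest(data, index, record_id, fields):
    """Write the record, plus one index entry per field value."""
    for field, value in fields.items():
        data.put(record_id, field, value)
        # Index row is the value; the column points back to the record.
        index.put(value, field + "\x00" + record_id)


def lookup(data, index, value):
    """Find records containing `value` without scanning the data table."""
    hits = {}
    for column, _ in index.scan_row(value):
        _field, record_id = column.split("\x00", 1)
        hits[record_id] = dict(data.scan_row(record_id))
    return hits


data, index = SortedTable(), SortedTable()
ingest(data, index, "rec001", {"name": "alice", "city": "boston"})
ingest(data, index, "rec002", {"name": "bob", "city": "boston"})
ingest(data, index, "rec003", {"name": "carol", "city": "denver"})

boston = lookup(data, index, "boston")
print(sorted(boston))
```

The cost of a lookup here depends on the number of matching index
entries rather than on the size of the data table, which is the
property that keeps indexed queries fast as the corpus grows. The
trade-off is extra writes and storage at ingest time.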

Regarding the SQL piece, Accumulo does not natively support an SQL
interface. For that you would need to wrap it in a processing framework,
like Hive (https://issues.apache.org/jira/browse/ACCUMULO-143). To make a
shameless plug, Sqrrl (www.sqrrl.com) also offers that functionality.

Cheers,
Adam

On Fri, May 3, 2013 at 12:39 PM, Drew Pierce <[EMAIL PROTECTED]> wrote:

> does anyone have any anecdotal results (nothing formal) for queries to
> speak to the likes of impala and near low-latency.
> Sent from my Android
>
> Sorry if brief
>
>
William Slacum 2013-05-03, 18:36