If the table has more than one region, MapReduce might be useful, because an MR job scans all the regions in parallel. Compared with a full scan from a client API with no parallelism, the MR job might be faster. But it will use more resources on the cluster and might impact the SLA of the other clients, if any.
JM

2014-04-14 2:42 GMT-04:00 Mohammad Tariq <[EMAIL PROTECTED]>:
I need to fetch about 20,000 rows from a table of about 1,000,000 rows. My first version issued 20,000 Gets, and I found it very slow, so I changed it to a scan that filters out unrelated rows on the client. Maybe I should write a coprocessor. By the way, is there any filter available for this? Something like the SQL clause where rowkey in ('abc', 'abd', ...), with a very long IN list.
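To make the two approaches in this message concrete, here is a minimal sketch in plain Java. It does not use the HBase client at all: a sorted in-memory `TreeMap` stands in for the table, and the class and method names (`RowKeyLookupSketch`, `sortedBatches`, `batchedGet`, `boundedScan`) are hypothetical, chosen only for illustration. It compares grouping the wanted row keys into sorted batches of point lookups against scanning only the key range between the smallest and largest wanted key and filtering on the client. With the real client, the first option would roughly correspond to passing a list of Get objects in one call instead of issuing them one at a time, and the second to a Scan bounded by start and stop rows; whether either is fast enough depends on how the keys are spread across the table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustration only: a TreeMap stands in for a table sorted by row key.
// None of these names come from the HBase API.
final class RowKeyLookupSketch {

    // Sort the wanted keys and cut them into batches of at most batchSize keys.
    // Sorting keeps keys that are neighbours in the table in the same batch.
    public static List<List<String>> sortedBatches(Set<String> wanted, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<String> sorted = new ArrayList<>(new TreeSet<>(wanted));
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i += batchSize) {
            int end = Math.min(i + batchSize, sorted.size());
            batches.add(new ArrayList<>(sorted.subList(i, end)));
        }
        return batches;
    }

    // Option A: point lookups, issued one batch at a time.
    // In a real client each batch would be a single round trip (assumption).
    public static Map<String, String> batchedGet(NavigableMap<String, String> table,
                                                 Set<String> wanted, int batchSize) {
        Map<String, String> found = new TreeMap<>();
        for (List<String> batch : sortedBatches(wanted, batchSize)) {
            for (String key : batch) {
                String value = table.get(key);
                if (value != null) {
                    found.put(key, value);
                }
            }
        }
        return found;
    }

    // Option B: scan only [smallest wanted key, largest wanted key]
    // and drop unrelated rows on the client side.
    public static Map<String, String> boundedScan(NavigableMap<String, String> table,
                                                  Set<String> wanted) {
        Map<String, String> found = new TreeMap<>();
        if (wanted.isEmpty()) {
            return found;
        }
        TreeSet<String> sorted = new TreeSet<>(wanted);
        NavigableMap<String, String> range =
                table.subMap(sorted.first(), true, sorted.last(), true);
        for (Map.Entry<String, String> row : range.entrySet()) {
            if (wanted.contains(row.getKey())) {
                found.put(row.getKey(), row.getValue());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            table.put(String.format("row%04d", i), "v" + i);
        }
        Set<String> wanted = new HashSet<>(Arrays.asList("row0005", "row0500", "row0999"));
        System.out.println(batchedGet(table, wanted, 2));
        System.out.println(boundedScan(table, wanted));
    }
}
```

Both methods return the same rows; the difference is how much of the table gets touched. The bounded scan only pays off when the wanted keys cluster in a narrow range, otherwise it degenerates into the full scan described above.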
On Mon, Apr 14, 2014 at 7:46 PM, Jean-Marc Spaggiari <[EMAIL PROTECTED]> wrote:
On Tue, Apr 15, 2014 at 3:39 AM, Doug Meil <[EMAIL PROTECTED]> wrote: