It does so by splitting the row keys into ranges. The application controls the row keys, so it can choose whatever it likes as the row key. If you prefix the row key with a prefix of the key's hash, you get hash partitioning.
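As a rough sketch of the hash-prefix idea (plain Python, not HBase client code; the function name `salted_row_key`, the MD5 choice, the 4-character prefix, and the `_` separator are all assumptions made for illustration):

```python
import hashlib


def salted_row_key(key: str, prefix_len: int = 4) -> bytes:
    """Build a row key of the form <hash prefix>_<original key>.

    The hash prefix decides where the row sorts, so keys that are adjacent
    in their natural order tend to land in different key ranges.
    """
    raw = key.encode("utf-8")
    # Hex digest of the original key; only the first few characters are used.
    digest = hashlib.md5(raw).hexdigest()
    return digest[:prefix_len].encode("ascii") + b"_" + raw


if __name__ == "__main__":
    for k in ["user0001", "user0002", "user0003"]:
        print(salted_row_key(k))
```

A point lookup still works because the reader can recompute the same prefix from the original key. What you give up is scanning rows in their natural key order, since the stored keys now sort by hash prefix first.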
________________________________ From: Nai Yan. <[EMAIL PROTECTED]> To: dev <[EMAIL PROTECTED]> Sent: Friday, August 15, 2014 9:46 PM Subject: May I ask why HBase choose to partition data by range?
Hello, may I ask why HBase chooses to partition data by range? Why not by hash or list? I believe this must have been discussed in the design phase of HBase.
Partitioning by range allows for efficient range scans. Logically, the ranges act like a sorted list with index hints.
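To make the "sorted list with index hints" picture concrete, here is a toy model in plain Python (not HBase code; the split points, the sample keys, and the names `REGION_STARTS`, `REGIONS`, `find_region`, and `range_scan` are invented for illustration). The region start keys play the role of the hints, and a scan only touches the ranges that overlap the requested interval:

```python
from bisect import bisect_left, bisect_right

# Hypothetical split points: range i holds keys in [REGION_STARTS[i], REGION_STARTS[i+1]).
REGION_STARTS = ["", "g", "n", "t"]
# Each range keeps its own keys sorted.
REGIONS = [
    ["apple", "banana", "cherry"],
    ["grape", "kiwi", "lemon"],
    ["nectarine", "orange", "peach"],
    ["tangerine", "watermelon"],
]


def find_region(key: str) -> int:
    """Return the index of the range whose start key is the largest one <= key."""
    return bisect_right(REGION_STARTS, key) - 1


def range_scan(start: str, stop: str) -> list:
    """Return all keys in [start, stop) in sorted order."""
    out = []
    first = find_region(start)
    for idx in range(first, len(REGIONS)):
        # Stop as soon as a range begins at or after the end of the scan.
        if idx > first and REGION_STARTS[idx] >= stop:
            break
        rows = REGIONS[idx]
        lo = bisect_left(rows, start)
        hi = bisect_left(rows, stop)
        out.extend(rows[lo:hi])
    return out


if __name__ == "__main__":
    print(range_scan("banana", "kiwi"))
```

With hash partitioning the same scan would have to visit every partition, because keys that are neighbours in sort order could be stored anywhere.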
Other systems that default to hashing cannot efficiently scan through all of their data in key order. The nice thing about HBase, though, is that you can choose to hash your row key and get efficient key-value access, effectively turning the table into a hash-partitioned store.
HBase started life as a system inspired by Google's Bigtable, and took many design cues from there.
On Sunday, August 17, 2014, 乃岩 <[EMAIL PROTECTED]> wrote: