For #1, the two regions would each contain roughly half of the data.

For #2, one region would not receive new data. As you can see, such a schema design is suboptimal.

For #3, you can split the key space evenly. Using the number of region servers as the number of splits is okay.
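
As a rough illustration of splitting the key space evenly, here is a minimal sketch. It assumes row keys are spread uniformly over a fixed-width byte prefix (for example a hashed or salted prefix), which may not match your schema. The function name even_split_keys and the key_width parameter are made up for this example and are not part of any HBase API.

```python
def even_split_keys(num_regions: int, key_width: int = 4) -> list[bytes]:
    """Return num_regions - 1 split keys that divide a fixed-width
    byte key space into ranges of (nearly) equal size.

    Assumption: keys are uniformly distributed over key_width bytes.
    """
    if num_regions < 1:
        raise ValueError("num_regions must be at least 1")
    space = 256 ** key_width  # total number of distinct prefixes
    if num_regions > space:
        raise ValueError("more regions than distinct keys")
    # The i-th boundary sits at i/num_regions of the way through the space.
    return [
        (i * space // num_regions).to_bytes(key_width, "big")
        for i in range(1, num_regions)
    ]


if __name__ == "__main__":
    # Hypothetical cluster with 4 region servers, 1-byte key prefix.
    for key in even_split_keys(4, key_width=1):
        print(key.hex())
```

With 4 region servers and a 1-byte prefix this gives three boundaries, hence four regions. The resulting keys would then be supplied as the split keys when the table is created.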

Cheers

On Jul 16, 2014, at 12:25 AM, Shushant Arora <[EMAIL PROTECTED]> wrote:
 