HBase >> mail # dev >> Primary Key Design


Primary Key Design
Hi,
I am starting to use the following scheme for primary keys:
SHA256(URL) + "-RAW" Primary Key Schema
<https://outsideiq.jira.com/browse/CA-107>

RATIONALE:
* User-defined PKs in Lily are prefixed with "USER.", so I can't use a URI
directly, for instance (it contains dots, and the dot is a special character
in the current version)
* In addition to the SHA-256-generated PK, Lily will still use a UUID (which
is truly unique) for versioning...
* IMPORTANT: we need to randomize PKs; this is a best practice with HBase
(data will then be distributed randomly across the cluster)

I also suggest using a similar scheme, SHA256(JSON-Object-in-UTF8) + "-OIQ",
for OIQ objects. The tag is a postfix so that we keep good "randomization":
in HBase, all data is physically sorted by PK, so the hash has to come first.
All OIQ objects will be stored denormalized as JSON (string type in Lily).
Note that the JSON will be UTF-8 encoded; I believe that is also part of the
ECMA specs.
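To illustrate the randomization point above, here is a minimal sketch of how hash-first keys sort. It is only an illustration under assumptions: the class name KeyOrderDemo, the helpers rowKey and sortedByRowKey, and the example.com URLs are all made up for this example, and a TreeMap stands in for HBase's sorted row order (for ASCII hex keys, String ordering matches byte ordering).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical demo class; not part of Lily or HBase.
final class KeyOrderDemo {

    // Hex-encoded SHA-256 of the UTF-8 bytes of the given text.
    static String sha256Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // Hash first, tag last: the leading characters of the key come from the hash.
    static String rowKey(String source, String suffix) {
        return sha256Hex(source) + suffix;
    }

    // Maps row key -> URL; a TreeMap iterates in ascending key order.
    static TreeMap<String, String> sortedByRowKey(List<String> urls) {
        TreeMap<String, String> rows = new TreeMap<>();
        for (String url : urls) {
            rows.put(rowKey(url, "-RAW"), url);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> urls = new ArrayList<>();
        for (int i = 1; i <= 5; i++) {
            urls.add("http://example.com/page/" + i);
        }
        // Prints the rows in key order, which is unrelated to the URL order.
        sortedByRowKey(urls).forEach((key, url) -> System.out.println(key + "  " + url));
    }
}
```

With the tag as a prefix instead ("RAW-" + hash), every key of one type would share the same leading bytes and sort into one contiguous range, which is what the postfix avoids.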
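import java.io.UnsupportedEncodingException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/**
 * SHA-256 helper for generating hashed primary keys.
 *
 * @see <a href="http://stackoverflow.com/questions/221165/pros-and-cons-of-using-md5-hash-of-uri-as-the-primary-key-in-a-database">Stack Overflow: pros and cons of using MD5 hash of URI as the primary key in a database</a>
 * @author Fuad
 */
public class SHA256 {

    /** Returns the SHA-256 digest of the given bytes as a lowercase hex string. */
    public static final String SHA256(byte[] bytes) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(bytes);
        byte[] mdbytes = md.digest();

        // Convert each byte to two hex digits, zero-padding values below 0x10
        StringBuilder hexString = new StringBuilder();
        for (int i = 0; i < mdbytes.length; i++) {
            String hex = Integer.toHexString(0xff & mdbytes[i]);
            if (hex.length() == 1) {
                hexString.append('0');
            }
            hexString.append(hex);
        }
        return hexString.toString();
    }

    /** Returns the SHA-256 hex digest of the UTF-8 encoding of the given text. */
    public static final String SHA256(String text)
            throws NoSuchAlgorithmException, UnsupportedEncodingException {
        return SHA256(text.getBytes("UTF-8"));
    }
}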
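The zero-padding step in the loop above matters: without it, a byte such as 0x0a would come out as "a" and the key would no longer be a fixed 64 characters. As a sanity check, here is a sketch of a more compact encoding, compared against the standard SHA-256 test vectors for "abc" and the empty string. The class name HexCheck and its method are hypothetical, chosen only for this example.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; an alternative encoding, not the class from this message.
final class HexCheck {

    // Hex-encoded SHA-256 of the UTF-8 bytes of the given text.
    static String sha256Hex(String text) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(text.getBytes(StandardCharsets.UTF_8));
            // Signum 1 treats the digest as a non-negative number;
            // %064x left-pads with zeros to the full 64 hex digits.
            return String.format("%064x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String abc = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";
        String empty = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";
        System.out.println(sha256Hex("abc").equals(abc));   // prints "true"
        System.out.println(sha256Hex("").equals(empty));    // prints "true"
    }
}
```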

--
Fuad Efendi
