Accumulo >> mail # user >> File hash key case observation


Re: File hash key case observation
After ingesting a few million files using the method in the Accumulo File
System Archive example (http://accumulo.apache.org/1.4/examples/dirlist.html),
we ran into a problem reading the information back out of Accumulo. I forget
the exact error, but I resolved it by using DigestUtils.md5Hex instead of
DigestUtils.md5, so that the MD5 is stored as a hex string instead of a
binary value. We did not dig into what caused the error; we just
side-stepped it.
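
For anyone comparing the two forms, here is a rough sketch of the difference
between the raw digest and its hex string. It uses only the JDK's
MessageDigest rather than Commons Codec, so the md5 and md5Hex helpers below
are stand-ins I wrote for illustration (assumed to behave like
DigestUtils.md5 and DigestUtils.md5Hex), not the library methods themselves.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5HexSketch {
    // Raw 16-byte MD5 digest (stand-in for DigestUtils.md5; an assumption, not the library call).
    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Lowercase 32-character hex string (stand-in for DigestUtils.md5Hex).
    static String md5Hex(byte[] data) {
        StringBuilder sb = new StringBuilder(32);
        for (byte b : md5(data)) {
            // Mask to 0..255 so negative bytes are rendered as two hex digits.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5(data).length); // number of raw bytes in the digest
        System.out.println(md5Hex(data));     // printable hex form, safe to use in a text row ID
    }
}
```

The raw bytes can contain any value, including non-printable ones, while the
hex string is plain ASCII. That may or may not be related to the error we
hit; as noted above, we never tracked down the cause.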
On Wed, Dec 4, 2013 at 11:37 PM, Chris Carrino <[EMAIL PROTECTED]> wrote:

> The org.apache.accumulo.examples.simple.filedata.FileDataIngest class
> generates LOWERCASE hash keys via the hexString() method, and uses them as
> row IDs for storing file chunks in Accumulo.  Note that NIST uses
> UPPERCASE hash keys in the Reference Data Set (RDS).  See
> http://www.nsrl.nist.gov/ for the RDS.  Both approaches are valid since
> the hexadecimal representation of the key is not case sensitive - but make
> sure you normalize to one case if you are comparing the keys generated in
> the FileDataIngest class to the RDS keys.
>
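
The case normalization Chris describes could be sketched roughly like this.
The class and method names (HashKeyCase, normalize, sameKey) are made up for
illustration and are not part of the Accumulo examples or the RDS tooling;
lowercase is chosen only because that is what FileDataIngest is said to
produce.

```java
import java.util.Locale;

class HashKeyCase {
    // Normalize a hex hash key to lowercase; Locale.ROOT avoids locale-specific case rules.
    static String normalize(String hexKey) {
        return hexKey.trim().toLowerCase(Locale.ROOT);
    }

    // Compare two hex keys without regard to case.
    static boolean sameKey(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    public static void main(String[] args) {
        String ingestKey = "5d41402abc4b2a76b9719d911017c592"; // lowercase, FileDataIngest style
        String rdsKey = "5D41402ABC4B2A76B9719D911017C592";    // uppercase, RDS style
        System.out.println(sameKey(ingestKey, rdsKey));
    }
}
```

Normalizing once, before a key is used as a row ID or lookup value, keeps
the comparison a plain string equality check.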