HBase user mailing list: Encoding when using Bytes.toBytes(String)?


Steinmaurer Thomas 2011-07-26, 13:37
Re: Encoding when using Bytes.toBytes(String)?
Bytes.toBytes(String) encodes using UTF-8 [1]. If all of your
characters are ASCII, then you'll use only one byte per character, so
converting to an ANSI string first would not save any space. Non-ASCII
characters from an ANSI code page (for example ä or €) take two or
three bytes each in UTF-8.
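
To check this on your own data, here is a minimal sketch that measures encoded sizes with the plain JDK. It uses String.getBytes(StandardCharsets.UTF_8) as a stand-in for Bytes.toBytes(String), on the assumption (per the javadoc in [1]) that the two produce the same bytes. ISO-8859-1 stands in for "ANSI", which is also an assumption, since that term can mean different code pages. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HBase: compares encoded sizes of a string.
final class Utf8LengthDemo {

    // Size in bytes when encoded as UTF-8 (assumed to match Bytes.toBytes(String)).
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Size in bytes in a single-byte encoding (ISO-8859-1 used here as "ANSI").
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        String asciiKey = "row-0001"; // ASCII only: same size in both encodings
        String umlaut = "\u00e4";     // a-umlaut: 1 byte in ISO-8859-1, 2 in UTF-8
        String euro = "\u20ac";       // euro sign: 3 bytes in UTF-8

        System.out.println("ascii  utf8=" + utf8Length(asciiKey)
                + " latin1=" + latin1Length(asciiKey));
        System.out.println("umlaut utf8=" + utf8Length(umlaut)
                + " latin1=" + latin1Length(umlaut));
        System.out.println("euro   utf8=" + utf8Length(euro));
    }
}
```

If the row keys and qualifiers really are pure ASCII, both methods return the same length, so a separate conversion step buys nothing.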

-Joey

[1] http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Bytes.html#toBytes(java.lang.String)

On Tue, Jul 26, 2011 at 6:37 AM, Steinmaurer Thomas
<[EMAIL PROTECTED]> wrote:
> Hello,
>
> we are currently running tests with respect to disk space usage when
> inserting records into our table. Just want to be sure: does
> Bytes.toBytes(String) encode each character with 2 bytes (Unicode)?
>
> As we only have ANSI characters in the rowkey (~ 48 characters) and
> qualifier values, I wonder if we could save disk space by converting
> stuff to an ANSI string before sending it to the server?
>
> Thanks,
> Thomas

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
Steinmaurer Thomas 2011-07-27, 06:07
Joey Echeverria 2011-07-27, 13:00