Steinmaurer Thomas 2011-07-26, 13:37
Re: Encoding when using Bytes.toBytes(String)?
Joey Echeverria 2011-07-26, 16:35
Bytes.toBytes(String) encodes using UTF-8 [1]. If all of your
characters are ASCII, then you'll use only one byte per character. I
think some ANSI characters will map to multibyte characters in UTF-8.

-Joey

[1] http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/Bytes.html#toBytes(java.lang.String)

On Tue, Jul 26, 2011 at 6:37 AM, Steinmaurer Thomas <[EMAIL PROTECTED]> wrote:
> Hello,
>
> we are currently running tests in respect to disk space usage when
> inserting records into our table. Just want to be sure, if
> Bytes.toBytes(String) encodes a character with 2 bytes (Unicode)?
>
> As we only have ANSI characters in the rowkey (~ 48 characters) and
> qualifier values, I wonder if we could save disk space by converting
> stuff to an Ansi-String before sending it to the server?
>
> Thanks,
> Thomas

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
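
To make the byte counts in the reply above concrete, here is a minimal sketch in plain Java. It does not call HBase; it assumes, as stated in the reply, that Bytes.toBytes(String) produces UTF-8, and mimics that with String.getBytes(StandardCharsets.UTF_8). The class name EncodingSize, the helper names, and the sample strings are made up for illustration. ISO-8859-1 stands in for "ANSI" here, which is also an assumption, since "ANSI" often means a Windows code page such as windows-1252.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of HBase.
public class EncodingSize {

    // Encoded size in UTF-8 (the encoding the reply says Bytes.toBytes(String) uses).
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encoded size in ISO-8859-1, used here as a stand-in for an "ANSI" single-byte encoding.
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        String asciiKey = "row-2011-07-26-0001"; // ASCII only, 19 characters
        String accented = "Stra\u00dfe";         // "Strasse" spelled with sharp s (U+00DF), 6 characters

        // ASCII text: one byte per character in both encodings, so nothing is saved.
        System.out.println(utf8Length(asciiKey) + " vs " + latin1Length(asciiKey));

        // U+00DF takes two bytes in UTF-8 but one byte in ISO-8859-1.
        System.out.println(utf8Length(accented) + " vs " + latin1Length(accented));
    }
}
```

If the rowkeys and values really contain only ASCII characters, the two sizes are identical, so converting to a single-byte encoding before writing would not reduce disk usage. A difference only appears for characters above U+007F.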
Steinmaurer Thomas 2011-07-27, 06:07
Joey Echeverria 2011-07-27, 13:00