Accumulo >> mail # dev >> Setting Charset in getBytes() call.


Re: Setting Charset in getBytes() call.
Since this only affects Strings, I'm even more inclined to leave the option
to the JVM. Most of our methods that accept a `CharSequence` or `String`
object end up creating a `Text` object from it, which encodes it
as UTF-8. I'd much rather make it our convention to always convert
`String` to `Text` objects if we need to deal with them in a textual way;
otherwise we're just dealing with `byte[]` when serializing keys and
values.
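
To make the difference concrete, here is a minimal sketch using only the
JDK. The class and method names (`CharsetDemo`, `utf8`, `platformDefault`)
are made up for illustration and are not from the Accumulo code base. It
compares the no-argument `getBytes()`, which follows the JVM default
charset, with an explicit UTF-8 encoding, which is what `Text` is
described above as using.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetDemo {
    // Explicit charset: the resulting bytes are the same on every JVM.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // No-argument getBytes(): uses the JVM default charset, so the
    // resulting bytes can differ between platforms and JVM settings.
    static byte[] platformDefault(String s) {
        return s.getBytes();
    }

    public static void main(String[] args) {
        String s = "caf\u00e9"; // contains one non-ASCII character
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("explicit UTF-8:  " + Arrays.toString(utf8(s)));
        System.out.println("JVM default:     " + Arrays.toString(platformDefault(s)));
        System.out.println("same bytes?      " + Arrays.equals(utf8(s), platformDefault(s)));
    }
}
```

For pure ASCII input the two calls usually agree, which is why this kind
of difference tends to show up only once non-ASCII data arrives.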

Now, it's another story if Thrift is serializing `String`s with the JVM
setting...

On Mon, Oct 29, 2012 at 1:00 PM, David Medinets <[EMAIL PROTECTED]> wrote:

> > David, can you give some sort of feel for the usages of the getBytes()
> > calls? Since most of the API deals with things in terms of Text and byte[]
> > (Key and Value decomposed), are most of the usages configuration/user-input
> > based as your initial snippet from InputFormatBase showed?
>
> I will post a list of the files that I have changed before I commit. I
> will post the file list as a response in this thread.
>