Accumulo, mail # dev - Setting Charset in getBytes() call.

Re: Setting Charset in getBytes() call.
William Slacum 2012-10-29, 17:13
Since this only affects Strings, I'm even more inclined to leave the choice
of charset to the JVM. Most of our methods that accept a `CharSequence` or
`String` object end up creating a `Text` object from them, which encodes them
with UTF-8. I'd much rather make it our convention to always convert
`String` to `Text` objects if we need to deal with them in a textual way;
otherwise we're just dealing with `byte[]` when serializing keys and
values.
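
To make the difference concrete, here is a minimal sketch of why a bare `getBytes()` call is sensitive to the JVM setting while an explicit UTF-8 encoding is not. It uses only the JDK so that it runs without Hadoop on the classpath; `CharsetDemo`, `encodeUtf8`, and `decodeUtf8` are hypothetical names for illustration, not anything in the Accumulo code base.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CharsetDemo {

    // Encode with an explicit charset, so the bytes do not depend on
    // the platform default (e.g. the file.encoding system property).
    static byte[] encodeUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicit charset to get the original String back.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String row = "caf\u00e9"; // contains one non-ASCII character

        // No-argument getBytes() uses Charset.defaultCharset(), which can
        // differ between JVMs and machines.
        byte[] platformBytes = row.getBytes();
        byte[] utf8Bytes = encodeUtf8(row);

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("default matches UTF-8: "
                + Arrays.equals(platformBytes, utf8Bytes));
        System.out.println("round trip ok: "
                + row.equals(decodeUtf8(utf8Bytes)));
    }
}
```

Whether the second line prints `true` depends on the JVM the sketch runs on, which is the whole point: only the explicit form is stable across machines.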

Now, it's another story if Thrift is serializing `String`s with the JVM
setting...

On Mon, Oct 29, 2012 at 1:00 PM, David Medinets <[EMAIL PROTECTED]> wrote:

> > David, can you give some sort of feel for the usages of the getBytes()
> > calls? Since most of the API deals with things in terms of Text and byte[]
> > (Key and Value decomposed), are most of the usages configuration/user-input
> > based, as your initial snippet from InputFormatBase showed?
>
> I will post a list of the files that I have changed before I commit. I
> will post the file list as a response in this thread.
>