Accumulo >> mail # dev >> Setting Charset in getBytes() call.


Re: Setting Charset in getBytes() call.
I'm saying that I don't know of anything in the core API that performs
a getBytes() on the data itself. Accumulo itself is agnostic, dealing
only in byte[]. I think we're saying the same thing.

On 10/29/2012 8:54 PM, Benson Margulies wrote:
> On Mon, Oct 29, 2012 at 8:46 PM, Josh Elser <[EMAIL PROTECTED]> wrote:
>> +1 Mike.
>>
>> 1. It would be hard for me to believe Key/Value are ever handled internally
>> in terms of Strings, but, if such a case does exist, it would be extremely
>> prudent to fix.
>>
>> 2. FWIW, the Shell does use ISO-8859-1 as its charset, which is referenced by
>> other commands [1,2]. It would be good to double-check all of the other
>> commands.
>
> I'm a bit lost. Any possible Java String can be rendered in UTF-8. So,
> if you are calling String.getBytes to turn a string into some bytes
> for some purpose, I think you need UTF-8.
>
> On the other hand, as Mike pointed out, new String(somebytes, "utf-8")
> will destroy data for some byte values that are not, in fact, UTF-8.
> But why would Accumulo ever need to string-ify some array of bytes of
> uncertain parentage?
>
>
>>
>> [1]
>> https://github.com/apache/accumulo/blob/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/Shell.java
>> [2]
>> https://github.com/apache/accumulo/blob/trunk/core/src/main/java/org/apache/accumulo/core/util/shell/commands/InsertCommand.java
>>
>>
>> On 10/29/2012 8:27 PM, Michael Flester wrote:
>>>
>>> I agree with Benson entirely with one caveat. It seems to me that there
>>> might be two categories of things being discussed
>>>
>>>     1. User data (keys and values)
>>>     2. Ancillary things needed for operation of Accumulo (passwords).
>>>
>>> These could well be considered separately. Trying to do anything with
>>> keys and values other than treating them as bytes all of the time
>>> I find quite scary.
>>>
>>> And if this is only being done to satisfy pmd or findbugs, those tools
>>> can be convinced to modify their reporting about this issue.
>>>
>>
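
For readers following the round-trip argument quoted above, here is a minimal Java sketch of the two directions being discussed. The class name, helper method, and sample bytes are made up for illustration and are not taken from the Accumulo code base. It assumes the standard JDK behavior that decoding malformed input with new String(bytes, charset) substitutes a replacement character, and that getBytes(charset) substitutes a replacement byte for characters the charset cannot encode.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Accumulo.
class CharsetRoundTrip {

    // Decode raw bytes into a String with the given charset, then encode back.
    static byte[] roundTrip(byte[] raw, Charset cs) {
        return new String(raw, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        // Arbitrary binary data: 0xFF never occurs in well-formed UTF-8,
        // and 0xC3 followed by 0x28 is an incomplete two-byte sequence.
        byte[] raw = {(byte) 0xFF, (byte) 0xC3, (byte) 0x28, 0x41};

        // bytes -> String -> bytes via UTF-8 alters the malformed bytes.
        boolean utf8Safe = Arrays.equals(raw, roundTrip(raw, StandardCharsets.UTF_8));

        // ISO-8859-1 maps every byte value to exactly one char, so it round-trips.
        boolean latin1Safe = Arrays.equals(raw, roundTrip(raw, StandardCharsets.ISO_8859_1));

        // String -> bytes -> String: UTF-8 can encode this text, ISO-8859-1 cannot.
        String text = "price: \u20AC5"; // contains the euro sign
        String viaUtf8 = new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
        String viaLatin1 = new String(text.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.ISO_8859_1);

        System.out.println("binary survives UTF-8 round trip:      " + utf8Safe);
        System.out.println("binary survives ISO-8859-1 round trip: " + latin1Safe);
        System.out.println("text survives UTF-8 encoding:          " + text.equals(viaUtf8));
        System.out.println("text survives ISO-8859-1 encoding:     " + text.equals(viaLatin1));
    }
}
```

The sketch only illustrates why the two cases in the thread differ: text that starts life as a String is safe to encode as UTF-8, while bytes of unknown origin are safest left as byte[] and never decoded at all.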