Accumulo, mail # dev - Bug In Text Class - getLength() and getBytes().length are different.


David Medinets 2012-11-13, 18:00
Marc Parisi 2012-11-13, 18:13
Keith Turner 2012-11-13, 18:13
John Vines 2012-11-13, 18:23

Re: Bug In Text Class - getLength() and getBytes().length are different.
David Medinets 2012-11-13, 18:50
That makes sense. I have resolved my issue by passing the
Text.getLength() value along with Text.getBytes(). This works fine.
Thanks.
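
For anyone finding this thread later, here is a minimal, self-contained sketch of that workaround. It does not use Hadoop at all: a plain java.nio.ByteBuffer stands in for Text as an assumption, with array() playing the role of Text.getBytes() and limit() the role of Text.getLength(). The class name TextLengthSketch and the helper validBytes are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class TextLengthSketch {

  // Hypothetical helper: copy only the first `length` valid bytes out of a
  // backing array that may be larger than the data it holds.
  static byte[] validBytes(byte[] backing, int length) {
    return Arrays.copyOf(backing, length);
  }

  public static void main(String[] args) {
    byte[] data = "5000000000000000".getBytes(StandardCharsets.UTF_8);

    // Stand-in for a reusable buffer: one byte of spare capacity,
    // so the backing array is longer than the data written into it.
    ByteBuffer buf = ByteBuffer.allocate(data.length + 1);
    buf.put(data);
    buf.flip();

    byte[] backing = buf.array(); // analogous to Text.getBytes()
    int length = buf.limit();     // analogous to Text.getLength()

    System.out.println("backing.length: " + backing.length);
    System.out.println("valid length:   " + length);
    System.out.println("trimmed copy:   "
        + new String(validBytes(backing, length), StandardCharsets.UTF_8));
  }
}
```

With a real Text object the same idea would be to hand t.getBytes() and t.getLength() to the consumer together, or to trim a copy first, rather than relying on t.getBytes().length.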

On Tue, Nov 13, 2012 at 1:23 PM, John Vines <[EMAIL PROTECTED]> wrote:
> This is not a bug. A Text object is a reusable object which prevents
> repeated creation of byte arrays, so it will use the same byte array,
> resizing it if necessary, and writing over the previous values. Doing any
> operation based on Text.getBytes().length has a strong potential to provide
> inaccurate results. Text.getLength() is the appropriate way to get the
> number of valid bytes in the underlying byte array.
>
> John
>
>
> On Tue, Nov 13, 2012 at 1:00 PM, David Medinets <[EMAIL PROTECTED]> wrote:
>
>> The following code (the TextTest class) displays:
>>
>> cq: [5000000000000000]
>> cq: [16]
>> cq: [16]
>> cq: [5000000000000000]
>> cq: [16]
>> cq: [17]
>>
>> You'll notice that the last two numbers are different, but they should
>> both be 16. This bug affects Accumulo because of the following code in
>> Mutation:
>>
>>   private void put(byte b[]) {
>>     buffer.writeVLong(b.length);
>>     buffer.add(b, 0, b.length);
>>   }
>>
>>   private void put(Text t) {
>>     buffer.writeVLong(t.getLength());
>>     buffer.add(t.getBytes(), 0, t.getLength());
>>   }
>>
>> I should be able to call either of the following to get the same
>> result but I can't.
>>
>>   put("5000000000000000".getBytes());
>>   put(new Text("5000000000000000"));
>>
>> Has anyone else run into this issue? Any workarounds or fixes?
>>
>> ----
>>
>> package com.codebits.accumulo;
>>
>> import org.apache.hadoop.io.Text;
>>
>> public class TextTest {
>>
>>   public static void main(String[] args) {
>>     String s = "5000000000000000";
>>     System.out.println("cq: [" + s + "]");
>>     System.out.println("cq: [" + s.length() + "]");
>>     System.out.println("cq: [" + s.getBytes().length + "]");
>>
>>     Text cq = new Text(s);
>>     System.out.println("cq: [" + cq + "]");
>>     System.out.println("cq: [" + cq.getLength() + "]");
>>     System.out.println("cq: [" + cq.getBytes().length + "]");
>>   }
>>
>> }
>>