Accumulo >> mail # dev >> Bug In Text Class - getLength() and getBytes().length are different.


David Medinets 2012-11-13, 18:00
Marc Parisi 2012-11-13, 18:13
Keith Turner 2012-11-13, 18:13
John Vines 2012-11-13, 18:23

Re: Bug In Text Class - getLength() and getBytes().length are different.
That makes sense. I have resolved my issue by passing the
Text.getLength() value along with Text.getBytes(). This works fine.
Thanks.
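
For readers following along, the workaround can be sketched as below. This is only an illustration under stated assumptions: ReusableBuffer is a hypothetical stand-in that mimics the reuse behavior described in the quoted reply (it is not Hadoop's Text), and validBytes is a helper name made up for this sketch. The idea is to always pair the backing array with its valid length, or to copy just the valid prefix.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ValidBytesSketch {

  /** Hypothetical stand-in for a reusable buffer; NOT the real org.apache.hadoop.io.Text. */
  static final class ReusableBuffer {
    private byte[] bytes = new byte[0];
    private int length = 0;

    /** Overwrites the contents, growing the backing array only when needed. */
    void set(String s) {
      byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
      if (utf8.length > bytes.length) {
        bytes = new byte[utf8.length];
      }
      System.arraycopy(utf8, 0, bytes, 0, utf8.length);
      length = utf8.length;
    }

    /** Returns the backing array, which may hold stale bytes past getLength(). */
    byte[] getBytes() {
      return bytes;
    }

    /** Returns the number of valid bytes. */
    int getLength() {
      return length;
    }
  }

  /** Copies only the valid prefix of a backing array (helper name is an assumption). */
  static byte[] validBytes(byte[] backing, int length) {
    return Arrays.copyOf(backing, length);
  }

  public static void main(String[] args) {
    ReusableBuffer buf = new ReusableBuffer();
    buf.set("5000000000000000"); // 16 bytes
    buf.set("abc");              // reuses the same 16-byte array

    System.out.println(buf.getBytes().length); // 16
    System.out.println(buf.getLength());       // 3
    byte[] trimmed = validBytes(buf.getBytes(), buf.getLength());
    System.out.println(new String(trimmed, StandardCharsets.UTF_8)); // abc
  }
}
```

With the real Text class the same pattern would presumably be Arrays.copyOf(t.getBytes(), t.getLength()), since both of those methods appear in this thread.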

On Tue, Nov 13, 2012 at 1:23 PM, John Vines <[EMAIL PROTECTED]> wrote:
> This is not a bug. A Text object is a reusable object which prevents
> repeated creation of byte arrays, so it will use the same byte array,
> resizing it if necessary, and writing over the previous values. Doing any
> operation based on Text.getBytes().length has a strong potential to provide
> inaccurate results. Text.getLength() is the appropriate way to get the
> number of valid bytes in the underlying byte array.
>
> John
>
>
> On Tue, Nov 13, 2012 at 1:00 PM, David Medinets <[EMAIL PROTECTED]> wrote:
>
>> The following code (the TextTest class) displays:
>>
>> cq: [5000000000000000]
>> cq: [16]
>> cq: [16]
>> cq: [5000000000000000]
>> cq: [16]
>> cq: [17]
>>
>> You'll notice that the last two numbers are different, but they should
>> both be 16. This bug affects Accumulo because of the following code in
>> Mutation:
>>
>>   private void put(byte b[]) {
>>     buffer.writeVLong(b.length);
>>     buffer.add(b, 0, b.length);
>>   }
>>
>>   private void put(Text t) {
>>     buffer.writeVLong(t.getLength());
>>     buffer.add(t.getBytes(), 0, t.getLength());
>>   }
>>
>> I should be able to call either of the following to get the same
>> result but I can't.
>>
>>   put("5000000000000000".getBytes());
>>   put(new Text("5000000000000000"));
>>
>> Has anyone else run into this issue? Any workarounds or fixes?
>>
>> ----
>>
>> package com.codebits.accumulo;
>>
>> import org.apache.hadoop.io.Text;
>>
>> public class TextTest {
>>
>>   public static void main(String[] args) {
>>     String s = "5000000000000000";
>>     System.out.println("cq: [" + s + "]");
>>     System.out.println("cq: [" + s.length() + "]");
>>     System.out.println("cq: [" + s.getBytes().length + "]");
>>
>>     Text cq = new Text(s);
>>     System.out.println("cq: [" + cq + "]");
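>>     // getLength() is the number of valid bytes; getBytes() exposes the
>>     // whole backing array, which may be longer.
>>     System.out.println("cq: [" + cq.getLength() + "]");
>>     System.out.println("cq: [" + cq.getBytes().length + "]");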
>>   }
>>
>> }
>>
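
As a rough illustration of the two put overloads quoted above: when the length prefix and the copy both use the valid length, an oversized backing array yields the same bytes as an exact-size one. The sketch below is standalone and makes some assumptions: DataOutputStream.writeInt stands in for the variable-length writeVLong, and the class and method names are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LengthPrefixedPutSketch {

  /** Writes a length prefix followed by exactly `length` bytes of `b`. */
  static void put(DataOutputStream out, byte[] b, int length) throws IOException {
    out.writeInt(length);    // fixed 4-byte prefix; a stand-in for writeVLong
    out.write(b, 0, length); // bytes past the valid length are ignored
  }

  /** Encodes one value into a fresh byte array (hypothetical helper). */
  static byte[] encode(byte[] b, int length) {
    ByteArrayOutputStream sink = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(sink)) {
      put(out, b, length);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return sink.toByteArray();
  }

  public static void main(String[] args) {
    byte[] exact = "5000000000000000".getBytes(StandardCharsets.UTF_8); // 16 bytes
    // Simulates a backing array with one spare byte, as in the 16 vs 17 output.
    byte[] oversized = Arrays.copyOf(exact, 17);

    byte[] fromExact = encode(exact, exact.length);
    byte[] fromOversized = encode(oversized, exact.length);

    System.out.println(Arrays.equals(fromExact, fromOversized)); // true
    System.out.println(fromExact.length); // 20: 4-byte prefix + 16 payload bytes
  }
}
```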