Re: Wrapping around BitSet with the Writable interface
Another interesting alternative is JavaEWAH, an EWAH implementation of
compressed bitsets for Java that supports very fast OR operations.

https://github.com/lemire/javaewah

See also https://code.google.com/p/sparsebitmap/ by the same authors.

On Sun, May 12, 2013 at 1:11 PM, Bertrand Dechoux <[EMAIL PROTECTED]> wrote:

> In order to make the code more readable, you could start by using the
> BitSet methods toByteArray() and valueOf(byte[]), available since Java 7:
>
>
> http://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html#toByteArray%28%29
>
> http://docs.oracle.com/javase/7/docs/api/java/util/BitSet.html#valueOf%28byte[]%29
>
> Regards
>
> Bertrand
>
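
The two calls suggested above can be sketched as a round trip through DataOutput and DataInput. This is only a minimal sketch, not code from the thread: the class name BitSetBytes and its write/read helpers are hypothetical, and it uses plain java.io types rather than Hadoop's Writable so that it runs without Hadoop on the classpath. It assumes Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.BitSet;

// Hypothetical helper (not part of Hadoop or the original post).
final class BitSetBytes {

    // Writes a length-prefixed copy of the BitSet's bytes.
    static void write(BitSet bs, DataOutput out) throws IOException {
        byte[] bytes = bs.toByteArray();
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Reads back what write() produced.
    static BitSet read(DataInput in) throws IOException {
        int len = in.readInt();
        byte[] bytes = new byte[len];
        in.readFully(bytes);
        return BitSet.valueOf(bytes);
    }

    public static void main(String[] args) throws IOException {
        BitSet original = new BitSet();
        original.set(3);
        original.set(70);

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(original, new DataOutputStream(buffer));

        BitSet copy = read(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println(original.equals(copy));
    }
}
```

Inside a Writable, the bodies of write() and read() would presumably move into write(DataOutput) and readFields(DataInput) unchanged, replacing the ObjectOutputStream and ObjectInputStream plumbing.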
>
> On Sun, May 12, 2013 at 8:24 PM, Jim Twensky <[EMAIL PROTECTED]> wrote:
>
>> I have large java.util.BitSet objects that I want to bitwise-OR using a
>> MapReduce job. I decided to wrap each object in a class that implements
>> the Writable interface. Right now I convert each BitSet to a byte array
>> and serialize the byte array to disk.
>>
>> Converting them to byte arrays is a bit inefficient, but I could not find
>> a workaround to write them directly to the DataOutput. Is there a way to
>> skip this and serialize the object directly? Here is what my current
>> implementation looks like:
>>
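>> import java.io.ByteArrayInputStream;
>> import java.io.ByteArrayOutputStream;
>> import java.io.DataInput;
>> import java.io.DataOutput;
>> import java.io.IOException;
>> import java.io.ObjectInputStream;
>> import java.io.ObjectOutputStream;
>> import java.util.BitSet;
>>
>> import org.apache.hadoop.io.Writable;
>>
>> public class BitSetWritable implements Writable {
>>
>>   private BitSet bs;
>>
>>   public BitSetWritable() {
>>     this.bs = new BitSet();
>>   }
>>
>>   @Override
>>   public void write(DataOutput out) throws IOException {
>>     // Java-serialize the BitSet into an in-memory buffer first.
>>     ByteArrayOutputStream bos = new ByteArrayOutputStream(bs.size() / 8);
>>     ObjectOutputStream oos = new ObjectOutputStream(bos);
>>     oos.writeObject(bs);
>>     // Close (and thereby flush) before copying the buffer; otherwise
>>     // bytes still buffered inside the ObjectOutputStream are lost.
>>     oos.close();
>>     byte[] bytes = bos.toByteArray();
>>     // Length-prefix the payload so readFields knows how much to read.
>>     out.writeInt(bytes.length);
>>     out.write(bytes);
>>   }
>>
>>   @Override
>>   public void readFields(DataInput in) throws IOException {
>>     int len = in.readInt();
>>     byte[] bytes = new byte[len];
>>     in.readFully(bytes);
>>
>>     // try-with-resources closes the stream even if readObject fails.
>>     try (ObjectInputStream ois =
>>              new ObjectInputStream(new ByteArrayInputStream(bytes))) {
>>       bs = (BitSet) ois.readObject();
>>     } catch (ClassNotFoundException e) {
>>       throw new IOException(e);
>>     }
>>   }
>>
>> }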
>>
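
On the question of skipping the intermediate byte array: as far as I know, BitSet does not expose its internal words, so some copy seems unavoidable. One possible way to at least drop Java object serialization is to write the 64-bit words from toLongArray() straight to the DataOutput. The sketch below is an assumption-laden illustration, not the poster's code: BitSetWords and its write, read and orAll helpers are hypothetical names, and orAll only mimics what a reducer might do with its input values. It assumes Java 7 or later and uses plain java.io types so it runs without Hadoop.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.BitSet;

// Hypothetical helper (not part of Hadoop or the original post).
final class BitSetWords {

    // Writes the word count followed by each 64-bit word.
    // toLongArray() still copies the words once.
    static void write(BitSet bs, DataOutput out) throws IOException {
        long[] words = bs.toLongArray();
        out.writeInt(words.length);
        for (long word : words) {
            out.writeLong(word);
        }
    }

    // Reads back what write() produced.
    static BitSet read(DataInput in) throws IOException {
        int count = in.readInt();
        long[] words = new long[count];
        for (int i = 0; i < count; i++) {
            words[i] = in.readLong();
        }
        return BitSet.valueOf(words);
    }

    // Bitwise-ORs all inputs into a fresh BitSet, leaving the inputs untouched.
    static BitSet orAll(Iterable<BitSet> sets) {
        BitSet result = new BitSet();
        for (BitSet s : sets) {
            result.or(s);
        }
        return result;
    }
}
```

Each word costs exactly 8 bytes on the wire, so a very sparse BitSet with a high top bit still serializes to a large payload; that is the case where a compressed representation may be worth a look.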
>
>
>
> --
> Bertrand Dechoux
>