Hadoop, mail # user - what is the code for WritableComparator.readVInt and WritableUtils.decodeVIntSize doing?


Jane Wayne 2012-03-31, 04:38
in Tom White's book, Hadoop: The Definitive Guide (second edition, page 99),
he shows how to compare the raw bytes of keys that contain Text fields.
his example looks like the following.

int firstL1 = WritableUtils.decodeVIntSize(b1[s1]) + readVInt(b1, s1);
int firstL2 = WritableUtils.decodeVIntSize(b2[s2]) + readVInt(b2, s2);

his explanation is that firstL1 is the length of the first String/Text in
b1, and firstL2 is the length of the first String/Text in b2. but i'm
unsure of what the code is actually doing.

what is WritableUtils.decodeVIntSize(...) doing?
what is WritableComparator.readVInt(...) doing?
why do we have to add the outputs of these 2 methods to get the length of
the String/Text?
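
A minimal sketch of what the two calls compute, assuming a Text field is serialized as a variable-length integer (VInt) holding the byte count, followed by the UTF-8 bytes. The class VIntSketch and the helpers writeVInt, writeTextLike and fieldLength are hypothetical names made up for illustration. The two decoding methods are re-implemented here along the lines of Hadoop's VInt encoding rather than calling the library, so treat them as an approximation. decodeVIntSize reports how many bytes the length prefix itself occupies, readVInt reports the value stored in that prefix (the number of string bytes), and their sum is the total number of bytes the serialized field takes up in the buffer.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class VIntSketch {
    // How many bytes the VInt occupies, judged from its first byte alone.
    static int decodeVIntSize(byte first) {
        if (first >= -112) return 1;            // small value stored in this one byte
        if (first < -120) return -119 - first;  // marker byte for a negative value
        return -111 - first;                    // marker byte for a positive value
    }

    // The integer value encoded at bytes[start].
    static int readVInt(byte[] bytes, int start) {
        int len = bytes[start];
        if (len >= -112) return len;            // single-byte case
        boolean negative = len < -120;
        len = negative ? -(len + 120) : -(len + 112);
        long v = 0;
        for (int i = 0; i < len; i++) {
            v = (v << 8) | (bytes[start + 1 + i] & 0xFF);
        }
        return (int) (negative ? ~v : v);
    }

    // Hypothetical writer, non-negative values only (enough for a length).
    static void writeVInt(ByteArrayOutputStream out, int n) {
        if (n < 0) throw new IllegalArgumentException("sketch handles lengths only");
        if (n <= 127) { out.write(n); return; }
        int bytesNeeded = 0;
        for (int t = n; t != 0; t >>>= 8) bytesNeeded++;
        out.write(-112 - bytesNeeded);          // marker byte
        for (int i = bytesNeeded - 1; i >= 0; i--) {
            out.write((n >>> (8 * i)) & 0xFF);  // big-endian payload
        }
    }

    // Hypothetical stand-in for a serialized Text: VInt length, then UTF-8 bytes.
    static byte[] writeTextLike(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVInt(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        return out.toByteArray();
    }

    // Same expression as in the book: prefix size plus string byte count.
    static int fieldLength(byte[] b, int s) {
        return decodeVIntSize(b[s]) + readVInt(b, s);
    }

    public static void main(String[] args) {
        byte[] b1 = writeTextLike("hadoop");
        int s1 = 0;
        System.out.println("prefix bytes: " + decodeVIntSize(b1[s1]));
        System.out.println("string bytes: " + readVInt(b1, s1));
        System.out.println("field length: " + fieldLength(b1, s1));
    }
}
```

Under these assumptions the next field in b1 would start at s1 + firstL1, which is presumably why the two values are added: the sum covers the prefix and the string bytes together.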

could someone please explain in plain terms what's happening here? it seems
WritableComparator.readVInt(...) is already getting the length of the
byte[] corresponding to the string. it seems
WritableUtils.decodeVIntSize(...) is also doing the same thing (from
reading the javadoc).

when i look at WritableUtils.writeString(...), two things happen. the
length of the byte[] is written, followed by writing the byte[] itself. why
can't we simply do something like the following to get the length?

int firstL1 = readInt(b1, s1);
int firstL2 = readInt(b2, s2);
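
A small sketch of why a plain readInt is probably not a drop-in replacement, assuming the key bytes were written by Text (VInt length prefix) and not by WritableUtils.writeString, which, as far as I can tell, writes a fixed 4-byte int length. The class ReadIntSketch and its hand-built buffer are hypothetical, and readInt is re-implemented here as a big-endian 4-byte read rather than calling the library.

```java
import java.nio.charset.StandardCharsets;

class ReadIntSketch {
    // Big-endian read of exactly four bytes starting at bytes[start].
    static int readInt(byte[] bytes, int start) {
        return ((bytes[start] & 0xFF) << 24)
             | ((bytes[start + 1] & 0xFF) << 16)
             | ((bytes[start + 2] & 0xFF) << 8)
             |  (bytes[start + 3] & 0xFF);
    }

    // Hand-built buffer: one length byte (6) followed by the bytes of "hadoop".
    // This assumes the single-byte VInt form used for short strings.
    static byte[] sampleBuffer() {
        byte[] utf8 = "hadoop".getBytes(StandardCharsets.UTF_8);
        byte[] buf = new byte[1 + utf8.length];
        buf[0] = (byte) utf8.length;
        System.arraycopy(utf8, 0, buf, 1, utf8.length);
        return buf;
    }

    public static void main(String[] args) {
        byte[] b1 = sampleBuffer();
        // The real length sits in the first byte only.
        System.out.println("first byte: " + b1[0]);
        // readInt also swallows 'h', 'a', 'd' and treats them as part of the number.
        System.out.println("readInt:    " + readInt(b1, 0));
    }
}
```

With a 4-byte read the first three characters of the string end up inside the "length", and even a correct length alone would not include the size of the prefix, so the offset of the next field would still be off.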
Chris White 2012-03-31, 17:17
Jane Wayne 2012-04-01, 03:19