Thanks, Wes, for your help.

After reading through some of the code, I managed to put together a basic working example.
The code is here:

However, I do have some questions about the concepts in Arrow:

1. ArrowBlock is the unit of reading/writing. One ArrowBlock is essentially
the amount of data one must hold in memory at a time. Is my
understanding correct?

2. There are Base[Reader/Writer] interfaces as well as Mutator/Accessor
classes in the ValueVector interface - both are implemented by all
supported data types. What is the relationship between these two? When
is one supposed to use one over the other? I only use the Mutator/Accessor
classes in my code.

3. What are the "safe" variant functions in the Mutator's code? I could not
understand what they are meant to achieve.

4. What are MinorTypes?

5. For a writer, what is a dictionary provider? For example, in the code, the
reader is passed as the dictionary provider for the writer. But is it
anything more than just:
DictionaryProvider.MapDictionaryProvider provider = new
ArrowFileWriter arrowWriter = new ArrowFileWriter(root, provider,

6. I am not sure about the sequence of calls one needs to make to
write through mutators. For example, if I code something like
NullableIntVector intVector = (NullableIntVector) fieldVector;
NullableIntVector.Mutator mutator = intVector.getMutator();
[.write num values]
then this works for primitive types, but not for the VarBinary type. There I
have to set the capacity first:

NullableVarBinaryVector varBinaryVector = (NullableVarBinaryVector)
NullableVarBinaryVector.Mutator mutator = varBinaryVector.getMutator();

Examples of these are here:
(writeField[???] functions).

Thank you very much,

On Thu, Dec 14, 2017 at 6:15 PM, Wes McKinney <[EMAIL PROTECTED]> wrote: