I am over here asking questions as I am a bit lost right now.
Over in Gora we have a really nice test suite called Goraci. Basically,
the test runs many ingest clients that continually create linked lists
containing 25 million nodes. At some point the clients are stopped and a
MapReduce job is run to ensure no linked list has a hole. A hole indicates
data was lost. Generally speaking, the more nodes in the cluster, the
better the chance we have of finding that data was lost.
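To make the hole check concrete, here is a minimal sketch of the idea (not Goraci's actual implementation; the class and method names here are hypothetical). Each node records a pointer to the previously written node, and a "hole" is any node that is referenced but was never successfully written:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of Goraci-style hole detection.
// Each written node maps its id to the id of the previously written
// node (-1 for the head). A hole is a referenced-but-missing node,
// which indicates a lost write.
public class HoleCheck {

    // Returns the set of node ids that are referenced but were never written.
    static Set<Long> findHoles(Map<Long, Long> nodeToPrev) {
        Set<Long> holes = new HashSet<>();
        for (long prev : nodeToPrev.values()) {
            if (prev != -1 && !nodeToPrev.containsKey(prev)) {
                holes.add(prev);
            }
        }
        return holes;
    }

    public static void main(String[] args) {
        // Intended list: 1 <- 2 <- 3, but node 2's write was lost.
        Map<Long, Long> store = new HashMap<>();
        store.put(1L, -1L); // head, no predecessor
        store.put(3L, 2L);  // references node 2, which is missing
        System.out.println(findHoles(store)); // prints [2]
    }
}
```

In the real test this check is distributed: the MapReduce job maps each node to the node it references, and the reduce side flags any reference with no matching write.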
Now for part two... in Gora we currently have datastore implementations for
Accumulo, Avro, Cassandra, HBase and Amazon DynamoDB. What we do not
have is a mechanism to run the ingestion test against each datastore as a
controlled job, meaning that we could subsequently gather metrics and compare
behaviour across Gora datastores.
I have not gone to our friends @Infra yet, as I would rather do my homework
first and exhaust the avenues where I could contribute to getting this off
the ground.
My questions are therefore very simple: does anyone have an idea
of how we can get this working in tandem? Is this prime territory for
Thanks very much in advance.