Accumulo >> mail # user >> sanity checking application WALogs make sense


Sukant Hajra 2012-09-15, 05:44
Re: sanity checking application WALogs make sense
I'm a bit confused as to what you mean by "if an iterator goes down
mid-processing." If it goes down at all, then whatever scope it's running
in (minor compaction, major compaction, or scan) will most likely go down
as well, unless your iterator eats the exception and ignores errors. A
WALog shouldn't be deleted if whatever you were trying to do failed.
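
To make the "don't delete the log entry until the work succeeded" idea concrete, here is a minimal sketch in plain Java. It uses no Accumulo, Kestrel, or Akka API; the names AtLeastOnceSketch, Orchestration, and drain are hypothetical, the in-memory deque only stands in for a log, and the handler stands in for an idempotent orchestration. It shows an entry being retried after a simulated failure and acknowledged only once the handler returns normally.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: at-least-once processing of a log of idempotent work items.
class AtLeastOnceSketch {

    /** Hypothetical unit of work: set one column of one row to a value. */
    static final class Orchestration {
        final String row;
        final String column;
        final String value;

        Orchestration(String row, String column, String value) {
            this.row = row;
            this.column = column;
            this.value = value;
        }
    }

    /**
     * Processes entries in order. An entry is removed from the log only after
     * the handler returns normally, so a failed entry is retried (at-least-once).
     * Returns the number of handler invocations made.
     */
    static int drain(Deque<Orchestration> log, Consumer<Orchestration> handler, int maxAttempts) {
        int attempts = 0;
        while (!log.isEmpty() && attempts < maxAttempts) {
            Orchestration next = log.peekFirst(); // read without removing
            attempts++;
            try {
                handler.accept(next);
                log.removeFirst();                // acknowledge only on success
            } catch (RuntimeException e) {
                // Failure: the entry stays at the head of the log and is retried.
            }
        }
        return attempts;
    }

    public static void main(String[] args) {
        Map<String, String> table = new HashMap<>(); // stand-in for a table
        Deque<Orchestration> log = new ArrayDeque<>();
        log.add(new Orchestration("entity1", "status", "active"));
        log.add(new Orchestration("entity2", "status", "closed"));

        int[] calls = {0};
        Consumer<Orchestration> flaky = o -> {
            // Idempotent write: applying it twice leaves the same state.
            table.put(o.row + ":" + o.column, o.value);
            if (++calls[0] == 1) {
                throw new IllegalStateException("simulated crash after the write");
            }
        };

        int attempts = drain(log, flaky, 10);
        System.out.println("attempts=" + attempts + " remaining=" + log.size()
                + " rows=" + table.size());
    }
}
```

The first entry is applied twice here, which is only safe because the write is idempotent; that is the property that makes at-least-once delivery acceptable for the whole orchestration.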

On Sat, Sep 15, 2012 at 1:44 AM, Sukant Hajra <[EMAIL PROTECTED]> wrote:

> Hi guys,
>
> We've been slowly inching towards using iterators more effectively.  The
> typical use case of indexed docs fits one of our needs, and we wrote a
> prototype for it.
>
> We've recently realized that iterators are not just read-only, and that we
> can get more data-local functionality by taking advantage of their ability
> to mutate data as well.  We've only begun to think about how this may
> assist us.  A /lot/ of our critical data accesses are slightly complex, but
> local to one row.  We have billions of entities in our system, so a simple
> bijection of entities to rows works out really well for us with respect to
> iterators.
>
> Up to this point, we've had a planned architecture that uses Kestrel for a
> WALog and a messaging system like Akka for pipelining work.  Akka would
> help us manage flowing work from the user to the log and from the log to
> orchestrations of Accumulo intra-row reads and writes.  The log just helps
> us get a faster response time without sacrificing too much reliability.
>
> Recently someone asked why we use our own WALog when Accumulo has one
> natively in HDFS.  My response has been that Accumulo's WALog works at the
> lower granularity of individual mutations.  We want reliable
> orchestrations of mutations.  Our orchestrations are idempotent, but we
> want something along the lines of at-least-once delivery for the entire
> orchestration.  If an iterator goes down mid-processing, I fear Accumulo's
> native WALog is insufficient to claim we have a reliable enough system.
>
> I could definitely go through source code to validate this opinion, but I
> thought I'd bounce this reasoning off the list first.
>
> Also, I'm sure we're not the only people using Accumulo in this way.
> Please feel free to advise us if anyone's got other ideas for an
> architecture or feels we're thinking about the problem backwards.
>
> Thanks for your input,
> Sukant
>
Sukant Hajra 2012-09-15, 18:14
Billie Rinaldi 2012-09-17, 19:01