To answer the first part, the batch writer will block until it
finishes flushing. The mutations are applied to the write-ahead log
and an in-memory map on the tserver. The write-ahead logs are used for
failover, and the resulting keys are kept in the in-memory map until the
tserver flushes it to an RFile (minor compaction) or reorganizes its
RFiles (major compaction). To dig through the source, look at the
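The blocking flush described above looks roughly like this in client code. This is a sketch only: it assumes a reachable Accumulo instance, and the table name, row, and batch writer settings below are hypothetical (the three-argument createBatchWriter signature is the 1.4-era API):

```java
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class FlushSketch {
    // Sketch: conn would come from a ZooKeeperInstance.getConnector(...) call.
    static void writeAndRead(Connector conn) throws Exception {
        // maxMemory=1MB, maxLatency=1000ms, 2 write threads (hypothetical values)
        BatchWriter writer = conn.createBatchWriter("mytable", 1000000L, 1000L, 2);

        Mutation m = new Mutation("row1");
        m.put("cf", "cq", new Value("value".getBytes()));
        writer.addMutation(m);

        // flush() blocks until the tserver acknowledges that the mutation
        // has been written to the write-ahead log and the in-memory map.
        writer.flush();

        // A scan started after flush() returns sees the mutation, because
        // scans merge the tserver's in-memory map with the RFiles in HDFS.
        Scanner scanner = conn.createScanner("mytable", Authorizations.EMPTY);

        writer.close();
    }
}
```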
On Wed, May 16, 2012 at 1:32 PM, Sukant Hajra <[EMAIL PROTECTED]> wrote:
> Excerpts from Keith Turner's message of Wed May 16 02:42:17 -0500 2012:
>> After a call to flush() on a batchwriter returns, any mutations
>> written before the call to flush should be immediately visible.
> I don't want to belabor the point, but I just want to be sure I'm not
> interpreting your response too casually. From your response, I'm now under the
> impression that a flush blocks until the server sends back an acknowledgment
> that the mutation has been written to the log. Then all subsequent reads look
> not only at HDFS, but also the write logs to make sure they have the most
> consistent view? Is this the case? I appreciate the confirmation to save me a
> dig into the source code.
> If the reads are truly immediately consistent, has there ever been talk of
> making inconsistent reads for the sake of improving read times? Or is it all
> in the noise with respect to network speeds and not worth the effort?
> Also, if flush blocks waiting for an acknowledgment, I'm assuming that the
> writer will throw a MutationsRejectedException. If this happens, is the
> BatchWriter still usable? Or should I close it out and get a new one? The
> connector should be fine, though, right? I'm just trying to make sure I have
> my error handling logic sanely configured.
> Other than that, thanks a lot for your prompt responses. They really helped.
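On the error-handling question in the quoted message, a common recovery pattern is sketched below. This assumes (per the Accumulo javadoc) that a BatchWriter which has thrown MutationsRejectedException should be closed and replaced, while the Connector it came from remains usable; the table name and writer settings are hypothetical:

```java
import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.MutationsRejectedException;
import org.apache.accumulo.core.data.Mutation;

public class RejectionSketch {
    // Returns a usable writer: the original on success, or a fresh one
    // created from the same Connector after a rejection.
    static BatchWriter writeOrReplace(Connector conn, BatchWriter writer,
                                      Mutation m) throws Exception {
        try {
            writer.addMutation(m);
            writer.flush();
            return writer;
        } catch (MutationsRejectedException e) {
            // The exception reports why mutations failed: constraint
            // violations, authorization failures, or server errors.
            e.getConstraintViolationSummaries();
            e.getAuthorizationFailures();

            // Discard the failed writer; close() may itself throw the
            // same exception, which can be ignored here.
            try {
                writer.close();
            } catch (MutationsRejectedException ignored) {
            }

            // The Connector is still fine, so make a fresh writer from it.
            return conn.createBatchWriter("mytable", 1000000L, 1000L, 2);
        }
    }
}
```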