Yes, it is _directly_ related. The region servers use a write-ahead
log (WAL) that is replayed when they fail, but HDFS 0.20 cannot
guarantee that what we write to it has really been flushed to the
datanodes. So if the RS process dies for any reason, the log file is
never closed and the data in it is lost.
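
For reference, a minimal sketch of the setting involved. It assumes a
Hadoop build from the 0.20-append branch (or CDH3b2) that honors the
`dfs.support.append` property; on stock 0.20 it does not make the log
durable. As far as I know it goes in hdfs-site.xml, and in
hbase-site.xml on the HBase side:

```xml
<!-- Assumption: append-capable Hadoop build; this is not a fix for stock 0.20 -->
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
```

Restart HDFS and HBase after changing it, and check the documentation
of the build you are running before relying on it.
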
J-D
On Mon, Aug 2, 2010 at 10:50 AM, Vincent Barat <[EMAIL PROTECTED]> wrote:
> My issue is not related to an HDFS failure but to a regionserver process
> crash (just the HBase process, not the whole machine).
> Do you think it could still be related to this?
>
> On 02/08/10 19:22, Jean-Daniel Cryans wrote:
>>
>> HDFS started supporting fsSync in the 0.20-append branch (no release
>> yet) and 0.21.0, so data loss is expected in 0.20 (e.g. the latest
>> puts aren't durable). See Todd's presentation for more background
>> information (starts at slide #16):
>>
>> http://www.cloudera.com/blog/2010/03/hbase-user-group-9-hbase-and-hdfs/
>>
>> If you wish to use a "durable" HBase, you can use the latest HBase
>> 0.89 (available on the website) along with a snapshot of Hadoop's
>> 0.20-append branch. Alternatively, you can use Cloudera's CDH3b2,
>> which has both (I don't work for them, but it's probably just easier
>> to check out at the moment).
>>
>> J-D
>>
>> On Mon, Aug 2, 2010 at 10:12 AM, Vincent Barat<[EMAIL PROTECTED]> wrote:
>>>
>>> Hi,
>>>
>>> I have a simple Java program that writes data into a set of HBase
>>> tables using the HTable().put() call with an infinite number of
>>> retries (in order to block when HBase fails and resume when it is up
>>> again, and thus guarantee that my data is written sooner or later).
>>>
>>> My cluster is a test cluster of 2 regionservers running HBase 0.20.3.
>>>
>>> When one (1) regionserver failed, I experienced the following issue:
>>> all the data I wrote was lost, with no exception and no error
>>> reported (the call acted as if everything was OK).
>>>
>>> If I shut both regionservers down, I get my exceptions and errors and
>>> my code works fine (it blocks and resumes when HBase is up again).
>>>
>>> So my question is: is this a known problem? Isn't HTable().put()
>>> supposed to guarantee that the data has been correctly written when
>>> it returns with no failure?
>>>
>>> Regards,
>>>
>>>
>