
Re: Flushing to HDFS sooner
Manuel, do you have the WAL disabled? If not, what should have happened here is that the WAL was synced to disk when the row was written (flush or no flush), and on restart the system replayed that WAL to rebuild the in-memory state of the regions that were lost in the kill. So if everything seems to be correctly configured for durability, it's worth digging into this case to find out what happened. HBase promises that exactly this kind of thing won't happen, so I'm sure folks would be interested in helping debug it.
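
To make the expected behavior concrete, here is a toy model in Python of the write path described above. It is only a sketch and not HBase code: the `ToyRegion` class, its file names, and the `write_to_wal` parameter are all hypothetical, and a real region server is far more involved. It shows the three cases in this thread: a flushed row, a row that is only in the WAL, and a row written with the WAL skipped.

```python
import json
import os
import tempfile


class ToyRegion:
    """Toy stand-in for a region: a memstore, a WAL file and one store file."""

    def __init__(self, data_dir):
        # Hypothetical file names, chosen for this illustration only.
        self.wal_path = os.path.join(data_dir, "toy.wal")
        self.store_path = os.path.join(data_dir, "toy.store")
        self.memstore = {}  # in-memory state, lost when the process dies
        self.recover()

    def put(self, row, value, write_to_wal=True):
        if write_to_wal:
            # Append the edit to the log and sync it before acknowledging.
            with open(self.wal_path, "a", encoding="utf-8") as wal:
                wal.write(json.dumps({"row": row, "value": value}) + "\n")
                wal.flush()
                os.fsync(wal.fileno())
        self.memstore[row] = value

    def get(self, row):
        return self.memstore.get(row)

    def flush(self):
        # Simplification: persist the whole memstore as a single store file.
        with open(self.store_path, "w", encoding="utf-8") as store:
            json.dump(self.memstore, store)
            store.flush()
            os.fsync(store.fileno())
        # Every logged edit is now in the store file, so the log can be reset.
        open(self.wal_path, "w", encoding="utf-8").close()

    def recover(self):
        # On (re)start: load flushed data, then replay edits from the WAL.
        if os.path.exists(self.store_path):
            with open(self.store_path, encoding="utf-8") as store:
                self.memstore.update(json.load(store))
        if os.path.exists(self.wal_path):
            with open(self.wal_path, encoding="utf-8") as wal:
                for line in wal:
                    edit = json.loads(line)
                    self.memstore[edit["row"]] = edit["value"]


def demo():
    with tempfile.TemporaryDirectory() as data_dir:
        region = ToyRegion(data_dir)
        region.put("flushed", "a", write_to_wal=False)
        region.flush()                                    # survives via the store file
        region.put("logged", "b")                         # survives via WAL replay
        region.put("unlogged", "c", write_to_wal=False)   # memstore only
        del region                                        # simulate a hard kill
        restarted = ToyRegion(data_dir)
        return tuple(restarted.get(r) for r in ("flushed", "logged", "unlogged"))


if __name__ == "__main__":
    print(demo())
```

In this model only the row that was neither flushed nor logged is missing after the restart, which is why a vanished row with the WAL enabled would point to something going wrong in the log sync or replay step.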

Did you kill *all* the data nodes? Was there anything of note in the logs? Can you repro this case consistently? Any chance you can try with 0.92 for comparison?


On Feb 19, 2012, at 6:45 AM, "Manuel de Ferran" <[EMAIL PROTECTED]> wrote:

> Greetings,
> on a testing platform (running HBase-0.90.3 on top of Hadoop-0.20-append),
> we did the following:
> - create a dummy table
> - put a single row
> - get this row from the shell
> - wait a few minutes
> - kill -9 the datanodes
> Because the regionservers could not connect to the datanodes, they shut down.
> On restart, the row had vanished. But if we do the same and run "flush 'dummy'"
> from the shell before killing the datanodes, the row is still there.
> Is it related to the WAL? The MemStores? What happened?
> What are the recommended settings so rows are auto-flushed, or at least
> flushed more frequently?
> Regards