WALOG Design (Accumulo user mailing list)

First, thank you all for the responses to my BatchWriter question; with them I was able to increase my ingestion rate by a large factor. I am now hitting disk I/O limits, which is forcing me to look at reducing file copying. My main ideas here are lowering the Hadoop replication factor and reducing the number of major compactions.
However, from what I understand about write-ahead logs (in 1.4), even if you remove all major compactions, all data will still be written to disk twice: once to the WALOG in a local directory (in 1.5 the WALOG is on HDFS), and then from the WALOG to an RFile on HDFS. Is this understanding correct?
I'm trying to understand what the primary reasons are for having the WALOG.
Is there any way to write directly to an RFile from the In-Memory Map (or have the WALOG in memory)?
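
To make the question concrete, here is a toy sketch of the general write-ahead-log pattern as I understand it. It is not Accumulo's code: `ToyTablet`, its file names, and the JSON record format are all made up for illustration. Each write is appended to a log and then put in an in-memory map; a flush writes the map out as a sorted file and discards the log. In this sketch the sorted file is written from memory, and the log is only read back after a simulated crash.

```python
import json
import os
import tempfile


class ToyTablet:
    """Hypothetical toy model of a WAL-backed write path (not Accumulo's API)."""

    def __init__(self, directory):
        self.directory = directory
        self.wal_path = os.path.join(directory, "wal.log")
        self.memory = {}          # stand-in for the in-memory map
        self.bytes_written = 0    # characters written by this instance (ASCII, so ~bytes)
        self._replay()

    def _replay(self):
        # Recovery: rebuild the in-memory map from any log left by a crash.
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path) as f:
            for line in f:
                key, value = json.loads(line)
                self.memory[key] = value

    def write(self, key, value):
        # 1) Make the write durable in the log, 2) apply it in memory.
        record = json.dumps([key, value]) + "\n"
        with open(self.wal_path, "a") as f:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())
        self.bytes_written += len(record)
        self.memory[key] = value

    def flush(self):
        # Write the in-memory map as a new sorted file, then drop the log.
        existing = [n for n in os.listdir(self.directory) if n.startswith("flush-")]
        path = os.path.join(self.directory, "flush-%04d.sorted" % len(existing))
        lines = [json.dumps([k, self.memory[k]]) + "\n" for k in sorted(self.memory)]
        with open(path, "w") as f:
            f.writelines(lines)
        self.bytes_written += sum(len(line) for line in lines)
        self.memory.clear()
        if os.path.exists(self.wal_path):
            os.remove(self.wal_path)
        return path


def demo():
    with tempfile.TemporaryDirectory() as d:
        # Normal path: two writes, then a flush.
        tablet = ToyTablet(d)
        tablet.write("row2", "b")
        tablet.write("row1", "a")
        logged = tablet.bytes_written
        tablet.flush()
        total = tablet.bytes_written

        # Crash path: writes are logged, the process "dies" before flushing,
        # and a fresh instance recovers them from the log.
        crashed = ToyTablet(d)
        crashed.write("row3", "c")
        recovered = ToyTablet(d)
        return {
            "logged": logged,
            "total": total,
            "recovered": dict(recovered.memory),
        }
```

In this toy, `demo()` reports `total` as twice `logged`, which is the double write I am asking about, and `recovered` shows what the log buys: the unflushed write survives the simulated crash. Whether the real trade-off looks like this in 1.4 is exactly what I would like confirmed.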