Home | About | Sematext search-lucene.com search-hadoop.com
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB
 Search Hadoop and all its subprojects:

Switch to Threaded View
HBase >> mail # user >> Understanding HBase log output regarding memstore flush


Copy link to this message
-
Re: Understanding HBase log output regarding memstore flush
Hi Asaf,

I believe HDFS will see the 35.5MB worth of data.  The 132.7MB is the size
of the data in the memstore with the overhead of the ConcurrentSkipListMap
which is a pointer-heavy data structure.

Are you using compression?  If so then the 35.5 is the compressed size, and
you should see a metric on the regionserver's rs-status page with
the storefileUncompressedSizeMB.

Let's say you are using compression and
your storefileUncompressedSizeMB=70MB, then:

* start with 132.7MB in the memstore
* the memstore flusher will rewrite the data into blocks (totaling 70MB),
compress them, and write 35.5MB of data to HDFS
* when reading the blocks back, they will be uncompressed and stored in the
block cache with total size of 70MB

Matt
On Thu, Jun 28, 2012 at 10:17 AM, Asaf Mesika <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I'm trying to figure out some discrepancies I'm witnessing in the HBase
> Region Server log file.
>
> It states that a flush was requested, and then a memstore flush is started.
> It says the flush size, after snapshotting is 139105600 (~132.7m).
> In the log message below, the file size of the file the memstore was
> flushed too is not the same size (132.7m):
>
> [Quote]
> Added
>
> hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/data/8aa2974bdd7c4222a783b9b1558f9915,
> entries=214652, sequenceid=1068343, filesize=*35.5m*
> [/Quote]
>
> I'm curios to this difference in size, since I eventually would like to
> know the HDFS write throughput HBase is experiencing while flushing
> memstore to disk.
>
>
> *Logs quote*
>
> 2012-06-25 16:23:11,905 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Flush requested on
> food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960.
> 2012-06-25 16:23:11,906 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Started memstore flush for
> food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960., current region
> memstore size 132.7m
> 2012-06-25 16:23:11,906 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Finished snapshotting
> food_logs,,1340630544300.64fd0d8da5714f03eb67f7f788a99960., commencing wait
> for mvcc, flushsize=139105600
> 2012-06-25 16:23:11,906 DEBUG org.apache.hadoop.hbase.regionserver.HRegion:
> Finished snapshotting, commencing flushing stores
> 2012-06-25 16:23:11,938 DEBUG org.apache.hadoop.hbase.util.FSUtils:
> Creating
>
> file:hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915with
> permission:rwxrwxrwx
> 2012-06-25 16:23:11,960 DEBUG
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2: Initialized with
> CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false]
> [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false]
> [cacheEvictOnClose=false] [cacheCompressed=false]
> 2012-06-25 16:23:11,960 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile: Delete Family Bloom filter
> type for
>
> hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915:
>  CompoundBloomFilterWriter
> 2012-06-25 16:23:12,218 DEBUG
> org.apache.hadoop.hbase.regionserver.LogRoller: HLog roll requested
> 2012-06-25 16:23:13,043 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile: NO General Bloom and NO
> DeleteFamily was added to HFile (
>
> hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915
> )
> 2012-06-25 16:23:13,043 INFO org.apache.hadoop.hbase.regionserver.Store:
> Flushed , sequenceid=1068343, memsize=132.7m, into tmp file
>
> hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915
> 2012-06-25 16:23:13,050 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> Renaming flushed file at
>
> hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/.tmp/8aa2974bdd7c4222a783b9b1558f9915
>  to
>
> hdfs://dror.foo.local:8020/hbase/food_logs/64fd0d8da5714f03eb67f7f788a99960/data/8aa2974bdd7c4222a783b9b1558f9915
NEW: Monitor These Apps!
elasticsearch, apache solr, apache hbase, hadoop, redis, casssandra, amazon cloudwatch, mysql, memcached, apache kafka, apache zookeeper, apache storm, ubuntu, centOS, red hat, debian, puppet labs, java, senseiDB