HBase >> mail # dev >> Puzzling behaviour with HBase checksums


Re: Puzzling behaviour with HBase checksums
Oh, I never set that. What does it do, and could it be the reason we are
seeing this behaviour?

Thanks
Varun
On Fri, Jul 5, 2013 at 4:22 PM, Ted Yu <[EMAIL PROTECTED]> wrote:

> What value did you set for dfs.client.read.shortcircuit.skip.checksum ?
>
> Cheers
>
> On Fri, Jul 5, 2013 at 2:55 PM, Varun Sharma <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > We are running HBase with hbase.regionserver.checksum.verify set to true,
> > but we are seeing an equal number of seeks for .meta files on HDFS and for
> > data blocks. This is rather puzzling, and I don't know whether it is broken.
> > The HBase jar is compiled against Hadoop 2.0.3-alpha, and this behaviour
> > occurs for both 0.94.3 and 0.94.7. Short-circuit local reads are enabled and
> > working well, since only the region server is accessing the disk.
> >
> > We run an strace limited to lseek calls and get the following:
> >
> > 28162 lseek(*668*, 0, SEEK_SET)           = 0
> > 28162 lseek(*635*, 57479463, SEEK_SET)    = 57479463
> > 28162 lseek(*2255*, 0, SEEK_SET)          = 0
> > 28162 lseek(*1938*, 29285843, SEEK_SET)   = 29285843
> >
> > Then we use lsof to find the underlying files and match them against the
> > corresponding file descriptors:
> >
> > java    27947 hbase  *668u*   REG   202,32   1048583 36176608
> > /data/xvdc/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir54/*blk_5081211948968918615_597521.meta*
> >
> > java    27947 hbase  *635u*   REG   202,32 134217728 36176607
> > /data/xvdc/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir54/*blk_5081211948968918615*
> >
> > java    27947 hbase  *2255u*  REG   202,16    802375 32768850
> > /mnt/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir40/*blk_2670783290218647110_614641.meta*
> >
> > java    27947 hbase  *1938u*  REG   202,16 102702747 32768849
> > /mnt/hadoop/dfs/data/current/BP-1854623640-10.158.62.78-1363075060974/current/finalized/subdir40/*blk_2670783290218647110*
> >
> > The pattern in strace is pretty clear: first the .meta file is read, and
> > then the block is accessed. I am wondering whether there are other places,
> > apart from the checksum, where the .meta file for the HDFS block is
> > accessed, or whether the checksum code is simply broken. From more strace
> > output, it seems we are reading 7-byte values from these .meta files. Is
> > there a way I can find out whether the checksums were actually written out
> > to the HFiles in the first place?
> >
> > Thanks
> > Varun
> >
>
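
The manual step described in the quoted message, matching the file descriptors in the strace output against lsof, can be scripted. The following is only a rough sketch in Python, not something from the thread. The function names (parse_lsof, parse_lseeks, summarize) are made up for illustration. It assumes each lsof entry sits on a single line with the path as the last column, and that the `*` emphasis markers seen in the email can simply be stripped. The sample lines are the ones quoted above.

```python
import re
from collections import Counter

# Matches strace lines such as "28162 lseek(668, 0, SEEK_SET) = 0".
LSEEK_RE = re.compile(r"^\s*\d+\s+lseek\((\d+),\s*(-?\d+),\s*\w+\)\s*=\s*-?\d+")


def parse_lsof(lines):
    """Map numeric file descriptor -> path, assuming one lsof entry per line."""
    fds = {}
    for line in lines:
        fields = line.replace("*", "").split(None, 8)
        if len(fields) < 9:
            continue  # header, blank, or wrapped line; skipped in this sketch
        match = re.match(r"\d+", fields[3])  # FD column, e.g. "668u"
        if match:
            fds[int(match.group())] = fields[8]
    return fds


def parse_lseeks(lines):
    """Return (fd, offset) pairs for every lseek call found in strace output."""
    seeks = []
    for line in lines:
        match = LSEEK_RE.match(line.replace("*", ""))
        if match:
            seeks.append((int(match.group(1)), int(match.group(2))))
    return seeks


def summarize(strace_lines, lsof_lines):
    """Count seeks against .meta files versus block (data) files."""
    fds = parse_lsof(lsof_lines)
    counts = Counter()
    for fd, _offset in parse_lseeks(strace_lines):
        path = fds.get(fd)
        if path is None:
            counts["unknown"] += 1
        elif path.endswith(".meta"):
            counts["meta"] += 1
        else:
            counts["data"] += 1
    return counts


# Sample input copied from the message above.
BASE = "BP-1854623640-10.158.62.78-1363075060974/current/finalized"
STRACE = [
    "28162 lseek(*668*, 0, SEEK_SET)           = 0",
    "28162 lseek(*635*, 57479463, SEEK_SET)    = 57479463",
    "28162 lseek(*2255*, 0, SEEK_SET)          = 0",
    "28162 lseek(*1938*, 29285843, SEEK_SET)   = 29285843",
]
LSOF = [
    "java 27947 hbase 668u REG 202,32 1048583 36176608 "
    f"/data/xvdc/hadoop/dfs/data/current/{BASE}/subdir54/blk_5081211948968918615_597521.meta",
    "java 27947 hbase 635u REG 202,32 134217728 36176607 "
    f"/data/xvdc/hadoop/dfs/data/current/{BASE}/subdir54/blk_5081211948968918615",
    "java 27947 hbase 2255u REG 202,16 802375 32768850 "
    f"/mnt/hadoop/dfs/data/current/{BASE}/subdir40/blk_2670783290218647110_614641.meta",
    "java 27947 hbase 1938u REG 202,16 102702747 32768849 "
    f"/mnt/hadoop/dfs/data/current/{BASE}/subdir40/blk_2670783290218647110",
]

if __name__ == "__main__":
    print(dict(summarize(STRACE, LSOF)))
```

Run over a longer strace capture, a tally like this would show whether the one-to-one ratio of .meta seeks to block seeks holds across the whole trace or only in the excerpt. Descriptors that were closed before lsof ran would land in the "unknown" bucket.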