-
When does HBase set modification timestamp of a HFile?
yun peng 2012-10-20, 20:36
Hi all, I am trying to understand how and when HBase sets the modification timestamp for HFiles. My original intention is to get a timestamp for when an HFile is generated (that is, the time of the last write to the HFile during compaction). StoreFile.getModificationTime() looks like a good candidate, but in initial tests it shows some behaviour that confuses me.
My test case is like this, at the hbase shell:
  put 'usertable', "key1", 'cf:c1', "v1"
  put 'usertable', "key1", 'cf:c1', "v2"
  get 'usertable', 'key1', {COLUMN => 'cf:c1', VERSIONS => 4}
  flush 'usertable'
  major_compact 'usertable'
For the get operation, it echoes:
  COLUMN    CELL
  cf:c1     timestamp=1350764448150, value=v2
  cf:c1     timestamp=1350764448114, value=v1
But when I get the modification timestamp of the HFile generated by major_compact, it is 1350764448000, which is earlier than the timestamps of the KeyValues (even though those were put first). I have run the same test a couple of times, and the result is not consistent: sometimes the modification timestamp is earlier than the KeyValues' timestamps, sometimes it is later.
Does anyone know how HBase sets the modification timestamp of an HFile and the timestamp of a KeyValue pair? Or, more generally, how should I get a timestamp indicating when the last write to an HFile occurred?
regards, Yun
-
Re: When does HBase set modification timestamp of a HFile?
Stack 2012-10-20, 23:40
This is an interesting project. Is this HBase running on a distributed HDFS? Is the mod time set when the file is opened? I've not dug in, but it should be easy enough to figure out when the file gets its mod time. If you see variety in how it's being set, that would be particularly interesting. What versions of hbase+hdfs?
Thanks, St.Ack
-
Re: When does HBase set modification timestamp of a HFile?
lohit 2012-10-20, 23:54
The last 3 digits of the timestamp you got from getModificationTime() are all zeros: 1350764448000. Do you get all zeros all the time? Maybe the time precision is not what you are looking for here?
-- Have a Nice Day! Lohit
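A small sketch of why a seconds-precision modification time could land on either side of the cell timestamps, as reported in the original post. The two cell timestamps are the ones from the thread; the two "file written at" times are hypothetical values chosen for illustration, and the class and method names are made up for this example. It only assumes that the reported mtime has its millisecond part dropped, which is what the trailing 000 suggests.

```java
// Sketch only: models an mtime that is truncated to whole seconds.
class MtimeTruncation {
    // Drop the millisecond component, as a one-second-granularity mtime would.
    static long truncateToSeconds(long epochMillis) {
        return (epochMillis / 1000L) * 1000L;
    }

    public static void main(String[] args) {
        long v1 = 1350764448114L; // cell timestamp of v1 (from the thread)
        long v2 = 1350764448150L; // cell timestamp of v2 (from the thread)

        // Hypothetical: file written later than both puts, but within the same second.
        long sameSecondWrite = 1350764448900L;
        // Hypothetical: file written shortly after the second rolled over.
        long nextSecondWrite = 1350764449050L;

        long mtimeA = truncateToSeconds(sameSecondWrite);
        long mtimeB = truncateToSeconds(nextSecondWrite);

        // mtimeA looks earlier than both cells even though the write came after them.
        System.out.println(mtimeA + " earlier than v1? " + (mtimeA < v1));
        // mtimeB looks later than both cells.
        System.out.println(mtimeB + " later than v2? " + (mtimeB > v2));
    }
}
```

Under this assumption, whether the file appears older or newer than the cells depends only on whether the flush or compaction finished within the same wall-clock second as the puts.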
-
Re: When does HBase set modification timestamp of a HFile?
yun peng 2012-10-21, 01:42
Hi lohit and Stack, You are right, it always ends with 000. My test setup is hbase 0.94.2 running on local vfs. Is this tied to the current hbase implementation, or does it have something to do with my particular setup (so that I can re-configure it to be more precise)? Thanks, Yun
-
Re: When does HBase set modification timestamp of a HFile?
lohit 2012-10-21, 04:45
Looks like StoreFile.getModificationTime is returning a value with seconds precision. I am not sure what other APIs could be used here, but if you know the location of the StoreFile, try one of the FileSystem APIs:
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/fs/FileStatus.html#getModificationTime()
-- Have a Nice Day! Lohit
-
Re: When does HBase set modification timestamp of a HFile?
PG 2012-10-21, 11:03
Hi, for the record, I have tried the fs API as well, and the two return the same value. Regards, Yun
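Since the HBase call and the Hadoop fs API agree, the coarse precision may come from the layer underneath them. That is an assumption, not something the thread confirms. One way to probe it on a local setup is to set a known millisecond timestamp on a scratch file and read it back. The sketch below uses only the JDK's java.nio.file API, not Hadoop's FileSystem or HBase's StoreFile; the class name, method names, and temp-file prefix are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Sketch only: checks how much mtime precision survives a set/get round trip
// on the local filesystem used for temp files.
class LocalMtimeProbe {
    // Set the given mtime on a scratch file and return what the filesystem reports back.
    static long roundTrip(long epochMillis) {
        try {
            Path tmp = Files.createTempFile("mtime-probe", ".tmp");
            try {
                Files.setLastModifiedTime(tmp, FileTime.fromMillis(epochMillis));
                return Files.getLastModifiedTime(tmp).toMillis();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Millisecond component (0..999) of an epoch-millisecond timestamp.
    static long millisPart(long epochMillis) {
        return Math.floorMod(epochMillis, 1000L);
    }

    public static void main(String[] args) {
        long requested = 1350764448150L; // v2's cell timestamp from the thread
        long reported = roundTrip(requested);
        System.out.println("requested=" + requested + " reported=" + reported
                + " millisPart=" + millisPart(reported));
    }
}
```

If the reported millisecond part is 0, the filesystem or JVM in use stores or exposes whole seconds only, and no HBase setting would be expected to recover finer precision. If it comes back as 150, the truncation more likely happens in the API path that HBase and Hadoop use to read the mtime.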