


Retrieving 2 separate timestamps' values
Hi,
Is there any way of retrieving two values with totally different timestamps from a table?
I am using timestamps as iteration counts. At each iteration I would like to fetch, besides the previous iteration's results, some precomputed amounts that I save in certain columns with timestamp 0, without paying the cost of retrieving all versions of the table.
The only way I have come up with is to save the precomputed amounts redundantly at every timestamp up to the maximum possible.
Does anyone have an idea for a more efficient way of dealing with this?
Thanks and regards, IP
Ioakim Perros 20120828, 00:01

Re: Retrieving 2 separate timestamps' values
Do you mean timestamp as in version? Can you describe your scenario with a more concrete example?
Mohit Anchlia 20120828, 00:33

Re: Retrieving 2 separate timestamps' values
Of course, thank you for responding.
I have an iterative procedure in which I get and put data from/to an HBase table. At each Put I set the timestamp equal to the iteration number, because this makes checking for convergence efficient (I just retrieve the last 2 versions of my columns).
Some quantities in my equations stay the same across iterations, and I save them (serialized) in two specific columns of my table with timestamp equal to zero. The rest of my table's columns contain the (serialized) results of my iterations, which change from one iteration to the next.
The cached amounts must be read at every iteration, but it would not be efficient to scan all versions of all columns of my table just to retrieve the previous iteration's results plus the initially saved cached amounts.
For example, being at iteration 30 I would like to retrieve only columns 3 and 4 with timestamp 29 and columns 0 and 1 with timestamp 0.
With the current HBase API, I am not sure whether this is possible, and the solution I described in my previous message (storing columns 0 and 1 at all timestamps up to 40, for example) seems inefficient.
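To make the access pattern concrete, here is a minimal plain-Java sketch of the read described above. It is only an in-memory model of one row's versioned cells, not the HBase client API, and the names VersionedRow, readForIteration, and the column labels are hypothetical. It shows that the request amounts to two groups of exact (column, timestamp) lookups, so no other versions need to be touched. Whether a given HBase client version can express both groups in a single request is not settled in this thread.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of one row's versioned cells. NOT the HBase client API.
final class VersionedRow {
    // column name -> (timestamp -> serialized value)
    private final Map<String, TreeMap<Long, String>> cells = new TreeMap<>();

    void put(String column, long timestamp, String value) {
        cells.computeIfAbsent(column, c -> new TreeMap<>()).put(timestamp, value);
    }

    // Exact point lookup of a single version; returns null if it does not exist.
    String get(String column, long timestamp) {
        TreeMap<Long, String> versions = cells.get(column);
        return versions == null ? null : versions.get(timestamp);
    }

    // Reads the constant columns at timestamp 0 and the result columns at
    // timestamp (iteration - 1). Columns with no matching version are skipped.
    static Map<String, String> readForIteration(VersionedRow row, long iteration,
                                                List<String> constantColumns,
                                                List<String> resultColumns) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String column : constantColumns) {
            String value = row.get(column, 0L);
            if (value != null) out.put(column, value);
        }
        for (String column : resultColumns) {
            String value = row.get(column, iteration - 1);
            if (value != null) out.put(column, value);
        }
        return out;
    }

    public static void main(String[] args) {
        VersionedRow row = new VersionedRow();
        // Precomputed amounts, written once with timestamp 0.
        row.put("col0", 0L, "cachedA");
        row.put("col1", 0L, "cachedB");
        // Iteration results, one version per iteration.
        for (long it = 1; it <= 29; it++) {
            row.put("col3", it, "x" + it);
            row.put("col4", it, "y" + it);
        }
        // At iteration 30: columns 0 and 1 at timestamp 0, columns 3 and 4 at timestamp 29.
        System.out.println(readForIteration(row, 30L,
                List.of("col0", "col1"), List.of("col3", "col4")));
    }
}
```

In this model the cost of a read depends only on the number of columns requested, not on how many iterations have been stored.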
Any ideas?
Thanks and regards, IP
Ioakim Perros 20120828, 00:53

Re: Retrieving 2 separate timestamps' values
Have you thought of making your row key key+timestamp? Then you could scan on the columns themselves.
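One way to read the key+timestamp suggestion is sketched below in plain Java. The class name CompositeRowKey, the '#' separator, and the zero-padding width are assumptions made for illustration and are not part of the thread. The iteration number is zero-padded so that lexicographic ordering of the keys matches numeric ordering of the iterations, which is what a range scan over sorted row keys would rely on.

```java
import java.util.Locale;

// Hypothetical helper for building "key + iteration" row keys. Plain Java, no HBase calls.
final class CompositeRowKey {
    private static final char SEPARATOR = '#';

    // Zero-pads to 19 digits (enough for any non-negative long) so that
    // string order equals numeric order of the iteration.
    static String encode(String baseKey, long iteration) {
        if (iteration < 0) {
            throw new IllegalArgumentException("iteration must be non-negative");
        }
        return baseKey + SEPARATOR + String.format(Locale.ROOT, "%019d", iteration);
    }

    // Recovers the iteration number from a key produced by encode().
    static long iterationOf(String rowKey) {
        return Long.parseLong(rowKey.substring(rowKey.lastIndexOf(SEPARATOR) + 1));
    }

    public static void main(String[] args) {
        String cached = encode("row1", 0);    // precomputed amounts
        String previous = encode("row1", 29); // results of iteration 29
        System.out.println(cached);
        System.out.println(previous);
        System.out.println(iterationOf(previous));
    }
}
```

With such keys, the cached amounts and the previous iteration's results become two separate rows that can be fetched directly by key. The trade-off is that every reader and writer of the table has to adopt the new key layout.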
Mohit Anchlia 20120828, 01:10

Re: Retrieving 2 separate timestamps' values
Unfortunately, the way I read and write data from/to parts of my table would be incompatible with this solution.
In any case, thank you very much for your time.
Ioakim Perros 20120828, 01:41

