HBase >> mail # user >> Retrieving 2 separate timestamps' values


Re: Retrieving 2 separate timestamps' values
Have you thought of making your row key key+timestamp? Then you could do a
scan on the columns themselves.
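
To illustrate the idea, here is a minimal sketch in plain Java. It makes no HBase calls: a TreeMap stands in for the table, since both keep rows sorted by key. The class and method names, the '#' separator, the 10-digit zero padding, and storing the cached amounts at iteration 0 are all assumptions for the example, not anything from your schema.

```java
import java.util.Locale;
import java.util.NavigableMap;
import java.util.TreeMap;

public class CompositeKeySketch {
    // Hypothetical key layout: <key>#<iteration>, zero-padded to 10 digits so
    // that lexicographic order matches numeric order (non-negative iterations).
    static String rowKey(String key, long iteration) {
        return String.format(Locale.ROOT, "%s#%010d", key, iteration);
    }

    // Stand-in for the table: iteration 0 holds the cached amounts,
    // iterations 1..lastIteration hold each iteration's results.
    static NavigableMap<String, String> buildTable(String key, long lastIteration) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put(rowKey(key, 0), "cached amounts");
        for (long it = 1; it <= lastIteration; it++) {
            table.put(rowKey(key, it), "results of iteration " + it);
        }
        return table;
    }

    // Range read over consecutive iterations; both ends are inclusive here.
    static NavigableMap<String, String> range(
            NavigableMap<String, String> table, String key, long from, long to) {
        return table.subMap(rowKey(key, from), true, rowKey(key, to), true);
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = buildTable("eq1", 29);
        long current = 30;

        // Two point lookups: the cached amounts and the previous iteration.
        System.out.println(table.get(rowKey("eq1", 0)));
        System.out.println(table.get(rowKey("eq1", current - 1)));

        // The last two iterations, e.g. for a convergence check.
        System.out.println(range(table, "eq1", current - 2, current - 1).keySet());
    }
}
```

With this layout, the cached amounts and the previous iteration's results are two direct lookups by key, and nothing else has to be read.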

On Mon, Aug 27, 2012 at 5:53 PM, Ioakim Perros <[EMAIL PROTECTED]> wrote:

> Of course, thank you for responding.
>
> I have an iterative procedure where I get and put data from/to an HBase
> table, and I am setting at each Put the timestamp equal to each iteration's
> number, as it is efficient to check for convergence in this way (by just
> retrieving the 2 last versions of my columns).
>
> Some amounts in my equations are the same across iterations, and I save
> them (serialized) in two specific columns of my table with timestamp equal
> to zero. The rest of my table's columns contain the (serialized)
> alternating results of my iterations.
>
> The thing is that the cached amounts must be read at every iteration, but
> it would not be efficient to scan all versions of all columns of my table
> just to retrieve the previous iteration's results plus the initially saved
> cached amounts.
>
> For example, being at iteration 30 I would like to retrieve only columns 3
> and 4 with timestamp 29 and columns 0 and 1 with timestamp 0.
>
> With the current HBase API, I am not sure whether this is possible, and the
> solution I described in my previous message (storing columns 0 and 1 at
> all timestamps up to 40, for example) seems inefficient.
>
> Any ideas?
>
> Thanks and regards,
> IP
>
>
> On 08/28/2012 03:33 AM, Mohit Anchlia wrote:
>
>> You mean timestamp as in version? Can you describe your scenario with a
>> more concrete example?
>>
>> On Mon, Aug 27, 2012 at 5:01 PM, Ioakim Perros <[EMAIL PROTECTED]> wrote:
>>
>>> Hi,
>>>
>>> Is there any way of retrieving two values with totally different
>>> timestamps from a table?
>>>
>>> I am using timestamps as iteration counts, and I would like to be able to
>>> get at each iteration (besides the previous iteration's results from the
>>> table) some pre-computed amounts I save in some columns with timestamp 0,
>>> avoiding the cost of retrieving all of the table's versions.
>>>
>>> The only way I have come up with is to save the pre-computed amounts
>>> redundantly at every timestamp up to the maximum possible.
>>>
>>> Does anyone have an idea on a more efficient way of dealing with this?
>>>
>>> Thanks and regards,
>>> IP
>>>
>>>
>