HBase, mail # user - Pagination with HBase - getting previous page of data


Re: Pagination with HBase - getting previous page of data
Jean-Marc Spaggiari 2013-01-29, 21:08
No, no killer solution here ;)

But I'm still thinking about that because I might have to implement
some pagination options soon...

As you said, it only works on the row key. If you want to do the same
thing on non-rowkey columns, you might have to create a secondary
index table...
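
A minimal sketch of what such a secondary index table could look like, written in plain Python with an in-memory sorted list standing in for an HBase table. The `city` column, the `SEP` separator and the function names are all hypothetical; no real HBase API is used. The index key is the column value followed by the row key, so paging over one value becomes a range scan on the index:

```python
import bisect

# In-memory stand-in for the main table: row key -> {column: value}.
main_table = {
    "row1": {"city": "Paris"},
    "row2": {"city": "Berlin"},
    "row3": {"city": "Paris"},
    "row4": {"city": "Oslo"},
}

SEP = "\x00"  # hypothetical separator between the indexed value and the row key


def build_index(table, column):
    """Return sorted index keys of the form value + SEP + row_key."""
    return sorted(
        row[column] + SEP + key for key, row in table.items() if column in row
    )


def page_by_value(index, value, page_size, after=None):
    """Return up to page_size index keys whose indexed column equals value.

    `after` is the last index key of the previous page (None for the first page).
    """
    prefix = value + SEP
    if after is None:
        start = bisect.bisect_left(index, prefix)
    else:
        start = bisect.bisect_right(index, after)
    page = []
    for index_key in index[start:]:
        if not index_key.startswith(prefix) or len(page) == page_size:
            break
        page.append(index_key)
    return page


city_index = build_index(main_table, "city")
first = page_by_value(city_index, "Paris", 1)
second = page_by_value(city_index, "Paris", 1, after=first[-1])
```

The row key of each hit is the part of the index key after `SEP`, which is what you would then fetch from the main table.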

JM

2013/1/27, anil gupta <[EMAIL PROTECTED]>:
> That's alright. I thought you had come up with a killer solution, so I
> got curious to hear your ideas. ;)
> It seems the solution you describe below will not work when filtering on
> non-rowkey columns, since you only consider the row key when deciding
> the page numbers.
>
> Thanks,
> Anil
>
> On Fri, Jan 25, 2013 at 6:58 PM, Jean-Marc Spaggiari <
> [EMAIL PROTECTED]> wrote:
>
>> Hi Anil,
>>
>> I don't have a solution. I never thought about that ;) But I was
>> thinking of something like this: you create a 2nd table where you store
>> the row number (4 bytes), then the row key. To go directly to a
>> specific page, you query by the number, find the key, and you know
>> where to start your scan in the main table.
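
A rough sketch of that lookup, assuming the 4-byte number is stored big-endian and using a Python dict and sorted list as stand-ins for the two tables (the function names are made up for illustration):

```python
import bisect
import struct

# Stand-in for the main table: sorted row keys "010", "020", ..., "200".
main_keys = sorted(f"{n:03d}" for n in range(10, 210, 10))


def build_page_index(sorted_keys):
    """Map a 4-byte big-endian row number to the row key at that position."""
    return {struct.pack(">I", i): key for i, key in enumerate(sorted_keys)}


def get_page(sorted_keys, page_index, page_number, page_size):
    """Jump straight to a 0-based page via the row-number index."""
    row_number = struct.pack(">I", page_number * page_size)
    start_key = page_index.get(row_number)
    if start_key is None:
        return []  # page is past the end of the table
    # Equivalent of starting a scan at start_key in the main table.
    start = bisect.bisect_left(sorted_keys, start_key)
    return sorted_keys[start:start + page_size]


page_index = build_page_index(main_keys)
second_page = get_page(main_keys, page_index, 1, 10)
```

The index is only correct as long as it is rebuilt or updated whenever rows are inserted into or deleted from the main table.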
>>
>> The issue is probably the numbering of each line, since with an MR job
>> you don't know where you are relative to the beginning of the table. But
>> you can build something where you store the line number from the
>> beginning of the region, then when all regions are parsed you can
>> reconstruct the total numbering... That should work...
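
One way to picture the reconstruction step, assuming each region reports its row keys in order and the regions themselves are listed in start-key order (the data and the function name are invented for the example):

```python
from itertools import accumulate

# Hypothetical per-region output, regions listed in start-key order.
region_rows = [
    ["010", "020", "030"],
    ["040", "050"],
    ["060", "070", "080", "090"],
]


def global_numbering(regions):
    """Turn per-region line numbers into table-wide line numbers."""
    counts = [len(rows) for rows in regions]
    # Offset of a region = total number of rows in all regions before it.
    offsets = [0] + list(accumulate(counts))[:-1]
    numbering = {}
    for offset, rows in zip(offsets, regions):
        for local_number, key in enumerate(rows):
            numbering[offset + local_number] = key
    return numbering


numbering = global_numbering(region_rows)
```

Only the per-region counts are needed to compute the offsets, so the second pass is cheap compared to the scan of each region.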
>>
>> JM
>>
>> 2013/1/25, anil gupta <[EMAIL PROTECTED]>:
>> > Inline...
>> >
>> > On Fri, Jan 25, 2013 at 9:17 AM, Jean-Marc Spaggiari <
>> > [EMAIL PROTECTED]> wrote:
>> >
>> >> Hi Anil,
>> >>
>> >> The issue is that the start of every subsequent page should be moved
>> >> too...
>> >>
>> > Yes, this is a possibility. Hence the Developer has to take care of this
>> > case. It might also be possible that the pageSize is not a hard limit on
>> > number of results (more like a hint or suggestion on size). I would say
>> > it varies by use case.
>> >
>> >>
>> >> so if you want to jump directly to page n, you might be totally
>> >> shifted because of all the data inserted in the meantime...
>> >>
>> >> If you want a real, complete pagination feature, you might want to have
>> >> a coprocessor or an MR job updating another table that refers to the
>> >> pages...
>> >>
>> > Well, the solution depends on the use case. I will be doing pagination in
>> > HBase for a RESTful service, but so far I am unable to find any reason why
>> > this can't be done at the application level.
>> > Are you suggesting using MR for paging in HBase? If yes, how?
>> > How would you use another table for pagination? What would you store in
>> > the extra table?
>> >
>> >>
>> >> JM
>> >>
>> >> 2013/1/25, anil gupta <[EMAIL PROTECTED]>:
>> >> > Hi Vijay,
>> >> >
>> >> > I've done paging in HBase by using Scan only (no pagination filter), as
>> >> > Mohammed has explained. However, it was just experimental stuff. It
>> >> > works, but Jean raised a very good point.
>> >> > Find my answer inline to fix the problem that Jean reported.
>> >> >
>> >> >
>> >> > On Fri, Jan 25, 2013 at 4:38 AM, Jean-Marc Spaggiari <
>> >> > [EMAIL PROTECTED]> wrote:
>> >> >
>> >> >> Hi Vijay,
>> >> >>
>> >> >> If, while the user is scrolling forward, you store the key of each
>> >> >> page, then you will be able to go back to a specific page, and jump
>> >> >> forward again up to where he was.
>> >> >>
>> >> >> The only issue is that if, while the user is scrolling the table,
>> >> >> someone inserts a row between the last row of a page and the first
>> >> >> row of the next page, you will never see this row.
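
The bookkeeping described above, and the gap it leaves, can be sketched roughly as follows. This is plain Python over a sorted list standing in for the table; the `Pager` class is hypothetical and assumes "the key of each page" means the first row key of the page:

```python
import bisect

# Stand-in for the table: sorted row keys "010", "020", ..., "200".
table = sorted(f"{n:03d}" for n in range(10, 210, 10))


class Pager:
    def __init__(self, sorted_keys, page_size):
        self.keys = sorted_keys
        self.page_size = page_size
        self.page_starts = []  # first row key of every page visited so far
        self.current = -1      # index of the page currently shown

    def _scan(self, start_key, inclusive=True):
        find = bisect.bisect_left if inclusive else bisect.bisect_right
        start = find(self.keys, start_key)
        return self.keys[start:start + self.page_size]

    def next_page(self):
        if self.current + 1 < len(self.page_starts):
            # Page already visited: restart from its stored first key.
            page = self._scan(self.page_starts[self.current + 1])
        elif self.current == -1:
            page = self.keys[:self.page_size]
        else:
            # New page: start strictly after the last key of the current page.
            last_key = self._scan(self.page_starts[self.current])[-1]
            page = self._scan(last_key, inclusive=False)
        if not page:
            return []
        self.current += 1
        if self.current == len(self.page_starts):
            self.page_starts.append(page[0])
        return page

    def previous_page(self):
        if self.current <= 0:
            return []
        self.current -= 1
        return self._scan(self.page_starts[self.current])
```

If a key that sorts between two visited pages is inserted afterwards, going back and forward again replays the stored start keys, so neither page shows the new row.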
>> >> >>
>> >> >> Let's take this example.
>> >> >>
>> >> >> You have 10 items per page.
>> >> >>
>> >> >> 010 020 030 040 050 060 070 080 090 100 is the first page.
>> >> >> 110 120 130 140 150 160 170 180 190 200 is the second one.
>> >> >>
>> >> >> Now, if someone inserts 101... It will be just after 100 and before
>> >> >> 110.
>> >> >>