HBase, mail # user - Pagination with HBase - getting previous page of data


Re: Pagination with HBase - getting previous page of data
anil gupta 2013-01-28, 03:31
That's alright. I thought you had come up with a killer solution, so I
got curious to hear your ideas. ;)
It seems the solution you describe below will not work when filtering on
non-row-key columns, because the page numbers are decided from the row
key alone.

Thanks,
Anil

On Fri, Jan 25, 2013 at 6:58 PM, Jean-Marc Spaggiari <
[EMAIL PROTECTED]> wrote:

> Hi Anil,
>
> I don't have a solution. I never thought about that ;) But I was
> thinking of something like this: you create a 2nd table where you store
> the row number (4 bytes) followed by the row key. To go directly to a
> specific page, you query by the number, find the key, and then you know
> where to start your scan in the main table.
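A minimal sketch of the second-table idea, in Python with in-memory structures standing in for the two HBase tables. Everything here is an assumption made for illustration: the names `build_page_index` and `fetch_page` are hypothetical, the index holds one entry per page keyed by a 4-byte big-endian page number, and a sorted list plays the role of the main table.

```python
import struct
from bisect import bisect_left

PAGE_SIZE = 10
# Stand-in for the main table: row keys in sorted order ("010" .. "300").
main_rows = [f"{n:03d}" for n in range(10, 310, 10)]

def build_page_index(rows, page_size):
    """Stand-in for the 2nd table: 4-byte page number -> first row key of that page."""
    return {struct.pack(">I", page): rows[i]
            for page, i in enumerate(range(0, len(rows), page_size))}

def fetch_page(rows, index, page_number, page_size):
    """Look up the start key by page number, then read page_size rows from there."""
    start_key = index[struct.pack(">I", page_number)]  # lookup in the "2nd table"
    i = bisect_left(rows, start_key)                    # seek in the "main table"
    return rows[i:i + page_size]

index = build_page_index(main_rows, PAGE_SIZE)
third_page = fetch_page(main_rows, index, 2, PAGE_SIZE)  # pages are numbered from 0
```

The index has to be rebuilt or updated whenever rows are inserted or deleted, which is the maintenance cost discussed elsewhere in this thread.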
>
> The issue is probably numbering each line, since with an MR job you
> don't know where you are relative to the beginning. But you can build
> something where you store the line number counted from the beginning of
> the region; then, once all regions are parsed, you can reconstruct the
> total numbering... That should work...
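One way to read the "reconstruct the total numbering" step is as a prefix sum over per-region row counts. The sketch below assumes the regions are listed in row-key order and that each region has already counted its own rows; the function names and the counts are made up for illustration.

```python
from itertools import accumulate

def global_offsets(region_counts):
    """Global number of the first row in each region (exclusive prefix sum)."""
    return [0] + list(accumulate(region_counts))[:-1]

def global_row_number(region_index, local_number, offsets):
    """Convert a line number counted from the region start into a global one."""
    return offsets[region_index] + local_number

region_counts = [4, 7, 3]                 # rows per region, in key order (made-up numbers)
offsets = global_offsets(region_counts)   # first global row number of each region
```

With these offsets, the local numbers written by each task can be turned into global numbers in a second, much cheaper pass.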
>
> JM
>
> 2013/1/25, anil gupta <[EMAIL PROTECTED]>:
> > Inline...
> >
> > On Fri, Jan 25, 2013 at 9:17 AM, Jean-Marc Spaggiari <
> > [EMAIL PROTECTED]> wrote:
> >
> >> Hi Anil,
> >>
> >> The issue is that the start of every subsequent page would have to
> >> be moved too...
> >>
> > Yes, this is a possibility. Hence the developer has to take care of this
> > case. It might also be that the pageSize is not a hard limit on the
> > number of results (more like a hint or suggestion on size). I would say it
> > varies by use case.
> >
> >>
> >> so if you want to jump directly to page n, you might be totally
> >> shifted because of all the data inserted in the meantime...
> >>
> >> If you want a real, complete pagination feature, you might want to have
> >> a coprocessor or an MR job updating another table that refers to the
> >> pages...
> >>
> > Well, the solution depends on the use case. I will be doing pagination in
> > HBase for a RESTful service, but so far I am unable to find any reason
> > why this can't be done at the application level.
> > Are you suggesting to use MR for paging in HBase? If yes, how?
> > How would you use another table for pagination? What would you store in
> > the extra table?
> >
> >>
> >> JM
> >>
> >> 2013/1/25, anil gupta <[EMAIL PROTECTED]>:
> >> > Hi Vijay,
> >> >
> >> > I've done paging in HBase using Scan only (no pagination filter), as
> >> > Mohammed has explained. However, it was just experimental. It works,
> >> > but Jean raised a very good point.
> >> > Find my answer inline to fix the problem that Jean reported.
> >> >
> >> >
> >> > On Fri, Jan 25, 2013 at 4:38 AM, Jean-Marc Spaggiari <
> >> > [EMAIL PROTECTED]> wrote:
> >> >
> >> >> Hi Vijay,
> >> >>
> >> >> If, while the user is scrolling forward, you store the key of each
> >> >> page, then you will be able to go back to a specific page, and jump
> >> >> forward again up to where he was.
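The "store the key of each page" approach could look roughly like the sketch below. It is not HBase client code: a sorted Python list stands in for the table, `scan` imitates a forward scan from a start key, and the `Pager` class and its fields are hypothetical names.

```python
from bisect import bisect_left

# Stand-in for an HBase table: row keys in sorted order ("010" .. "250").
ROWS = [f"{n:03d}" for n in range(10, 260, 10)]
PAGE_SIZE = 10

def scan(start_key, limit):
    """Return up to `limit` row keys >= start_key, in key order."""
    i = bisect_left(ROWS, start_key)
    return ROWS[i:i + limit]

class Pager:
    def __init__(self):
        # page_starts[n] is the first key of page n; "" sorts before every key.
        self.page_starts = [""]

    def page(self, n):
        """Fetch page n (0-based); n may be at most one past the last page seen."""
        # Read one extra row: if it exists, it is the first key of page n + 1.
        rows = scan(self.page_starts[n], PAGE_SIZE + 1)
        if len(rows) > PAGE_SIZE and n + 1 == len(self.page_starts):
            self.page_starts.append(rows[PAGE_SIZE])
        return rows[:PAGE_SIZE]

pager = Pager()
first = pager.page(0)       # scrolling forward records the start of page 1
second = pager.page(1)      # ... and the start of page 2
first_again = pager.page(0) # going back reuses the stored start key
```

In a web service the list of page start keys would have to live in the user's session or be sent back to the client, since the table itself keeps no notion of pages.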
> >> >>
> >> >> The only issue is that if, while the user is scrolling the table,
> >> >> someone inserts a row between the last row of a page and the first
> >> >> of the next page, you will never see this row.
> >> >>
> >> >> Let's take this example.
> >> >>
> >> >> You have 10 items per page.
> >> >>
> >> >> 010 020 030 040 050 060 070 080 090 100 is the first page.
> >> >> 110 120 130 140 150 160 170 180 190 200 is the second one.
> >> >>
> >> >> Now, if someone inserts 101... It will be just after 100 and before
> >> >> 110.
> >> >>
> >> > Anil: Instead of scanning from 010 to 100, scan from 010 to 110. Then
> >> > we won't have this problem. So, I mean to say: use
> >> > startRow(firstRowKeyofPage(N)) and stopRow(firstRowKeyofPage(N+1)).
> >> > This would fix it. Also, in that case the number of results might
> >> > exceed the pageSize, so you might need to handle this in your logic.
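The startRow/stopRow fix can be illustrated with the 010..200 example from this thread. This is a sketch over a sorted Python list, not the HBase API; `scan_range` is a hypothetical helper that treats the stop key as exclusive, which matches how a Scan's stop row behaves by default.

```python
from bisect import bisect_left, insort

def scan_range(rows, start_key, stop_key=None):
    """Rows with start_key <= key < stop_key (stop exclusive; None means no stop)."""
    lo = bisect_left(rows, start_key)
    hi = len(rows) if stop_key is None else bisect_left(rows, stop_key)
    return rows[lo:hi]

rows = [f"{n:03d}" for n in range(10, 210, 10)]  # "010" .. "200"
page_starts = ["010", "110"]                      # first key of each page, recorded earlier

insort(rows, "101")                               # a row arrives between the two pages

# Page N runs from its own first key up to (not including) the first key of page N+1.
page_1 = scan_range(rows, page_starts[0], page_starts[1])
page_2 = scan_range(rows, page_starts[1])
```

Here `page_1` picks up the new row 101 and therefore holds 11 rows instead of 10, which is the "results might exceed the pageSize" case the application has to tolerate.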
> >> >
> >> >>
> >> >> When you display 10 rows starting at 010, you will stop just
> >> >> before 101... And for the next page you will start at 110... And 101

Thanks & Regards,
Anil Gupta