-
Pagination through families / columns?
Matthew Ward 2011-05-12, 20:49
Hey Guys,
Not sure if this functionality is available or not; if it's not, consider this a feature request :).
The main summary is that rows can contain massive amounts of data, so we can narrow selection by family. However, if the family is large enough, is there a way to grab parts of the family using an offset and a limit? To compound it further, what if the column names are dynamic?
Example
table 'foo'
family 'bar'
column '1111'
column '1112'
column '1113'
...
column '9999'
The request I would like to make is
'get', 'foo', 'somerowid' , 'bar:', {LIMIT => 10}
After discovering the column names, cursoring through them would look like
'get', 'foo', 'somerowid' , 'bar:', {LIMIT => 10, OFFSET => '1121'}
or maybe 'get', 'foo', 'somerowid' , 'bar:1121', {LIMIT => 10}
Other thoughts would be whether it's reversible or not ({ORDER => -1}), but more importantly whether it's available to the Thrift client.
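The cursor-style request above ('bar:1121' with a limit, optionally reversed) can be sketched in plain Java. This is only a model of the requested behavior, not HBase or Thrift code: the class ColumnCursor, the method page, and the sample data are hypothetical, and a TreeMap of strings stands in for a family whose qualifiers are kept in sorted order (HBase sorts qualifiers as bytes, which matches string order for these all-digit names).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of key-based paging over the sorted columns of one family.
class ColumnCursor {

    /**
     * Returns up to `limit` qualifiers beginning at `start` (inclusive).
     * A null `start` means "begin at the first column" (or the last one when reversed).
     */
    static List<String> page(NavigableMap<String, String> family,
                             String start, int limit, boolean reverse) {
        // Reversing first means tailMap() below walks downward from `start`.
        NavigableMap<String, String> view = reverse ? family.descendingMap() : family;
        if (start != null) {
            view = view.tailMap(start, true);
        }
        List<String> out = new ArrayList<>();
        for (String qualifier : view.keySet()) {
            if (out.size() >= limit) {
                break;
            }
            out.add(qualifier);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed sample data: columns '1111' .. '1140' in family 'bar'.
        NavigableMap<String, String> bar = new TreeMap<>();
        for (int i = 1111; i <= 1140; i++) {
            bar.put(Integer.toString(i), "value-" + i);
        }
        System.out.println(page(bar, null, 10, false));   // first page: 1111 .. 1120
        System.out.println(page(bar, "1121", 10, false)); // next page: 1121 .. 1130
        System.out.println(page(bar, "1121", 3, true));   // reversed: 1121, 1120, 1119
    }
}
```

Paging by start key rather than by numeric offset stays stable when column names are dynamic, since a column inserted before the cursor does not shift the next page.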
-
RE: Pagination through families / columns?
Panayotis Antonopoulos 2011-05-12, 22:08
If I understand what you need, there is the ColumnPaginationFilter that does exactly what you mention.
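As a rough illustration of the limit/offset semantics being described, here is a plain-Java model. It does not call HBase: OffsetLimitPage and page are hypothetical names, and the argument order (limit, then offset) simply mirrors how the filter is constructed later in this thread with new ColumnPaginationFilter(5,0).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of limit/offset paging over the sorted columns of one row.
class OffsetLimitPage {

    /** Skips the first `offset` columns, then returns at most `limit` columns. */
    static List<String> page(List<String> sortedQualifiers, int limit, int offset) {
        List<String> out = new ArrayList<>();
        int seen = 0;
        for (String qualifier : sortedQualifiers) {
            if (seen++ < offset) {
                continue; // still inside the skipped prefix
            }
            if (out.size() >= limit) {
                break;    // page is full
            }
            out.add(qualifier);
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed sample data: five columns in sorted order.
        List<String> columns = Arrays.asList("col1", "col2", "col3", "col4", "col5");
        System.out.println(page(columns, 2, 0)); // [col1, col2]
        System.out.println(page(columns, 2, 2)); // [col3, col4]
        System.out.println(page(columns, 2, 4)); // [col5]
    }
}
```

A client would advance the offset by the page size on each request and stop when a page comes back shorter than the limit.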
-
Re: Pagination through families / columns?
Matthew Ward 2011-05-13, 03:26
Oh interesting, is there a way to access it via thrift (from PHP)? Are there some docs I can read up on it?
Thanks! -Matt
-
Re: Pagination through families / columns?
Jean-Daniel Cryans 2011-05-13, 04:37
You'd have to hack it into the Thrift server. It shouldn't be too bad, but there's no such doc.
J-D
-
Re: Pagination through families / columns?
Matthew Ward 2011-05-14, 01:27
OK, so I am running a couple of tests to see if I will be able to successfully hack up the Thrift API. We are running version 0.89; here's the code I have in my test:
Filter newFilter = new ColumnPaginationFilter(5,0);
Get myget = new Get(Bytes.toBytes("row1"));
myget.setFilter(newFilter);
myget.addFamily(Bytes.toBytes("att"));
myget.setMaxVersions(1);
Result myR = table.get(myget);
System.out.println(myR.toString());
for (KeyValue kv : myR.list() ) {
    System.out.println( Bytes.toString( kv.getQualifier() ) + " : " + Bytes.toString( kv.getValue() ) );
}
In the table we have:
hbase(main):003:0> scan 'myTable'
ROW           COLUMN+CELL
 myLittleRow  column=att:someQualifier, timestamp=1305335658005, value=Some Value
 row1         column=att:col1, timestamp=1305329505518, value=hello
 row1         column=att:col2, timestamp=1305329526015, value=world
 row1         column=att:col3, timestamp=1305329532252, value=foo
 row1         column=att:col4, timestamp=1305329537921, value=bar
 row1         column=att:col5, timestamp=1305326707231, value=1
Running that code gives me the following output:
keyvalues={row1/att:col1/1305329505518/Put/vlen=5, row1/att:col2/1305329526015/Put/vlen=5}
col1 : hello
col2 : world
I am trying to determine whether we are just doing something wrong or whether the filter is run before the maxversions filtering, etc. The javadoc for 0.90 says it happens after the TTL, version, etc. filtering.
Further, I need to verify whether this is something that we can do with get and/or scan.
Thanks!
-
Re: Pagination through families / columns?
Jean-Daniel Cryans 2011-05-16, 19:14
It doesn't look like you are doing anything wrong. I also looked at the unit tests, and they seem to cover the basic usage of ColumnPaginationFilter. Can you try removing the addFamily and setMaxVersions calls to see if it has any effect?
Thx,
J-D
-
Re: Pagination through families / columns?
Jack Levin 2011-05-16, 19:23
When we change versions from 3 to 1 in the HBase table schema, things appear to work right.
-Jack
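One way to picture why the schema's version count could matter is the plain-Java model below. It is a guess at the mechanism, not HBase code and not a confirmed diagnosis: VersionCountingModel and both methods are hypothetical, and the number of stored versions per column (3 each) is assumed, since the thread does not show it. The model compares charging the limit per stored cell version before collapsing versions against collapsing to one version per column first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: does the page limit count cell versions or distinct columns?
class VersionCountingModel {

    /** Suspected order: charge the limit per stored version, then collapse to distinct columns. */
    static List<String> countCellsThenCollapse(Map<String, Integer> versionsPerColumn, int limit) {
        Set<String> columns = new LinkedHashSet<>();
        int counted = 0;
        for (Map.Entry<String, Integer> entry : versionsPerColumn.entrySet()) {
            for (int v = 0; v < entry.getValue(); v++) {
                if (counted >= limit) {
                    return new ArrayList<>(columns);
                }
                counted++;
                columns.add(entry.getKey());
            }
        }
        return new ArrayList<>(columns);
    }

    /** Expected order: one version per column first, then apply the limit. */
    static List<String> collapseThenCount(Map<String, Integer> versionsPerColumn, int limit) {
        List<String> columns = new ArrayList<>();
        for (String column : versionsPerColumn.keySet()) {
            if (columns.size() >= limit) {
                break;
            }
            columns.add(column);
        }
        return columns;
    }

    public static void main(String[] args) {
        // Assumption: every column of row1 holds 3 stored versions.
        Map<String, Integer> row1 = new LinkedHashMap<>();
        for (int i = 1; i <= 5; i++) {
            row1.put("col" + i, 3);
        }
        System.out.println(countCellsThenCollapse(row1, 5)); // [col1, col2]
        System.out.println(collapseThenCount(row1, 5));      // [col1, col2, col3, col4, col5]
    }
}
```

Under that assumption the first method reproduces the two-column result reported above, and with one version per column both methods return all five columns.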
-
Re: Pagination through families / columns?
Stack 2011-05-16, 19:36
That sounds like a bug in the filter. Make an issue?
St.Ack