HBase >> mail # user >> Pagination through families / columns?


Re: Pagination through families / columns?
When we change versions to 1 from 3 on the HBase table schema, things
appear to work right.

-Jack
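
One reading consistent with this observation, offered as an assumption rather than a confirmed diagnosis: if the pagination limit is applied to individual cells before versions are collapsed, a limit of 5 can be used up by the older versions of the first couple of columns. The sketch below is plain Java with no HBase calls. The class name, the method names, and the "three versions per column" data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class FilterOrderSketch {

    // Keep at most `limit` cells starting at `offset`.
    // Every stored version counts as one cell here (an assumption).
    static List<String> paginateCells(List<String> cells, int limit, int offset) {
        int from = Math.min(offset, cells.size());
        int to = Math.min(from + limit, cells.size());
        return new ArrayList<>(cells.subList(from, to));
    }

    // Collapse versions: keep only the first (newest) cell of each qualifier.
    static List<String> newestVersionOnly(List<String> cells) {
        return new ArrayList<>(new LinkedHashSet<>(cells));
    }

    public static void main(String[] args) {
        // Hypothetical row: five columns, three stored versions each,
        // sorted by qualifier. A repeated name stands for an older version.
        List<String> cells = new ArrayList<>();
        for (String q : new String[] {"col1", "col2", "col3", "col4", "col5"}) {
            for (int v = 0; v < 3; v++) {
                cells.add(q);
            }
        }

        // Paginate first, then collapse versions: prints [col1, col2]
        System.out.println(newestVersionOnly(paginateCells(cells, 5, 0)));

        // Collapse versions first, then paginate: prints [col1, col2, col3, col4, col5]
        System.out.println(paginateCells(newestVersionOnly(cells), 5, 0));
    }
}
```

With one version per column the two orderings give the same page, which would match what we see after the schema change.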

On Mon, May 16, 2011 at 12:14 PM, Jean-Daniel Cryans
<[EMAIL PROTECTED]> wrote:
> It doesn't look like you are doing anything wrong; also, I looked at
> the unit tests and they seem to cover the basic usage of
> ColumnPaginationFilter. Can you try removing the addFamily and
> setMaxVersions to see if it has any effect?
>
> Thx,
>
> J-D
>
> On Fri, May 13, 2011 at 6:27 PM, Matthew Ward <[EMAIL PROTECTED]> wrote:
>> OK, so I am running a couple of tests to see if I will be able to successfully hack up the thrift API. We are running version .89; here's the code I have in my test:
>>
>>  Filter newFilter = new ColumnPaginationFilter(5, 0);
>>  Get myget = new Get(Bytes.toBytes("row1"));
>>  myget.setFilter(newFilter);
>>  myget.addFamily(Bytes.toBytes("att"));
>>  myget.setMaxVersions(1);
>>  Result myR = table.get(myget);
>>  System.out.println(myR.toString());
>>  for (KeyValue kv : myR.list()) {
>>      System.out.println(Bytes.toString(kv.getQualifier()) + " : " + Bytes.toString(kv.getValue()));
>>  }
>>
>> In the table we have:
>>
>> hbase(main):003:0> scan 'myTable'
>> ROW                                      COLUMN+CELL
>>  myLittleRow                             column=att:someQualifier, timestamp=1305335658005, value=Some Value
>>  row1                                    column=att:col1, timestamp=1305329505518, value=hello
>>  row1                                    column=att:col2, timestamp=1305329526015, value=world
>>  row1                                    column=att:col3, timestamp=1305329532252, value=foo
>>  row1                                    column=att:col4, timestamp=1305329537921, value=bar
>>  row1                                    column=att:col5, timestamp=1305326707231, value=1
>>
>>
>> Running that code gives me the following output:
>>
>> keyvalues={row1/att:col1/1305329505518/Put/vlen=5, row1/att:col2/1305329526015/Put/vlen=5}
>> col1 : hello
>> col2 : world
>>
>>
>> I am trying to determine whether we are just doing something wrong or whether the filter is run before the maxversions filtering, etc. The javadoc for .90 says it happens after the TTL, version, etc. filtering.
>>
>> Further, I need to verify whether this is something that we can do with get and/or scan.
>>
>> Thanks!
>>
>>
>> On May 12, 2011, at 9:37 PM, Jean-Daniel Cryans wrote:
>>
>>> You'd have to hack it up into the thrift server, shouldn't be so bad
>>> but there's no such doc.
>>>
>>> J-D
>>>
>>> On Thu, May 12, 2011 at 8:26 PM, Matthew Ward <[EMAIL PROTECTED]> wrote:
>>>> Oh interesting, is there a way to access it via thrift (from PHP)? Are there some docs I can read up on it?
>>>>
>>>> Thanks!
>>>> -Matt
>>>>
>>>> On May 12, 2011, at 3:08 PM, Panayotis Antonopoulos wrote:
>>>>
>>>>>
>>>>> If I understand what you need, there is the ColumnPaginationFilter that does exactly what you mention.
>>>>>
>>>>>> From: [EMAIL PROTECTED]
>>>>>> Subject: Pagination through families / columns?
>>>>>> Date: Thu, 12 May 2011 13:49:16 -0700
>>>>>> To: [EMAIL PROTECTED]
>>>>>>
>>>>>> Hey Guys,
>>>>>>
>>>>>> Not sure if this functionality is available or not; if it's not, consider this a feature request :).
>>>>>>
>>>>>> The main summary is that rows can contain massive amounts of data, so we can narrow
>>>>>> selection by family. However, if the family is large enough, is there a way to grab parts of
>>>>>> the family using an offset and a limit? To compound it further, what if the column names
>>>>>> are dynamic?
>>>>>>
>>>>>> Example
>>>>>>
>>>>>> table 'foo'
>>>>>>  family 'bar'
>>>>>>    column '1111'
>>>>>>    column '1112'
>>>>>>    column '1113'
>>>>>>    ...
>>>>>>   column '9999'
>>>>>>
>>>>>>
>>>>>> The request I would like to make is
>>>>>>
>>>>>> 'get', 'foo', 'somerowid' , 'bar:', {LIMIT => 10}
>>>>>>
>>>>>> After discovering the column names and cursoring through
>>>>>>
>>>>>> 'get', 'foo', 'somerowid' , 'bar:', {LIMIT => 10, OFFSET => '1121'}
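
The LIMIT / OFFSET request above can be sketched as key-based paging over a family's sorted column names. This is an in-memory illustration only, not the HBase or Thrift API: the class `ColumnPageSketch`, the method `page`, and the sample column range are hypothetical. It also assumes the cursor is the last column name already seen (exclusive), whereas the example in the original request passes the first column of the next page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class ColumnPageSketch {

    // Return up to `limit` column names that sort strictly after `startAfter`.
    // Pass null as `startAfter` to read from the beginning of the family.
    static List<String> page(NavigableMap<String, String> family, String startAfter, int limit) {
        NavigableMap<String, String> tail =
            (startAfter == null) ? family : family.tailMap(startAfter, false);
        List<String> out = new ArrayList<>();
        for (String qualifier : tail.keySet()) {
            if (out.size() == limit) {
                break;
            }
            out.add(qualifier);
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for family 'bar': column names kept in sorted order.
        NavigableMap<String, String> bar = new TreeMap<>();
        for (int i = 1111; i <= 1140; i++) {
            bar.put(Integer.toString(i), "value-" + i);
        }

        List<String> first = page(bar, null, 10);    // 1111 .. 1120
        String cursor = first.get(first.size() - 1); // last column seen: 1120
        List<String> second = page(bar, cursor, 10); // 1121 .. 1130

        System.out.println(first);
        System.out.println(second);
    }
}
```

Paging by the last column name seen, rather than by a numeric position, keeps working when the column names are dynamic and not known in advance.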