Marc Reichman 2013-08-21, 14:00
John Vines 2013-08-21, 14:49
Slater, David M. 2013-08-21, 23:58
John Vines 2013-08-22, 00:38
Marc Reichman 2013-08-22, 14:19
David Medinets 2013-08-22, 16:16
Re: Filtering on column qualifier
I haven't considered that. Would that allow me to specify it in the
client-side code and not worry about spreading JARs around? It is a very
basic need; my scan loop currently does this:

            String matchScoreString = key.getColumnQualifier().toString();
            Double score = Double.parseDouble(matchScoreString);

            if (threshold != null && threshold > score) {
                // TODO: figure out if this is possible to do via data-local scan iterator
                continue;
            }

What is the pattern for including a Groovy snippet for a scan iterator?

On Thu, Aug 22, 2013 at 11:16 AM, David Medinets
<[EMAIL PROTECTED]> wrote:

> Have you thought of writing a filter class that takes some bit of Groovy
> for execution inside the accept method? Whether that fits depends on how
> efficient you need to be and how changeable your constraints are.
>
>
> On Thu, Aug 22, 2013 at 10:19 AM, Marc Reichman <
> [EMAIL PROTECTED]> wrote:
>
>> Extending looked like a bit of a boondoggle, because all of the useful
>> fields in the class are private, not protected. I also ran into another
>> architectural question: how does one pass a value (like a constructor
>> argument) into one of these classes? If I'm going to use this to filter
>> based on a threshold, I'd need to pass that threshold in somehow.
>>
>>
>>
>>
>> On Wed, Aug 21, 2013 at 9:49 AM, John Vines <[EMAIL PROTECTED]> wrote:
>>
>>> There's no way to extend the ColumnQualifierFilter via configuration, but
>>> it sounds like you are on top of it. You just need to extend the class,
>>> possibly copy a bit of code, and change the equality check to a compareTo
>>> after converting the Strings to Doubles.
>>>
>>>
>>> On Wed, Aug 21, 2013 at 10:00 AM, Marc Reichman <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> I have some data stored in Accumulo with some scores stored as column
>>>> qualifiers (there was an older thread about this). I would like to find a
>>>> way to do thresholding when retrieving the data without retrieving it all
>>>> and then manually filtering out items below my threshold.
>>>>
>>>> I know I can "fetch" column qualifiers which are exact.
>>>>
>>>> I've seen the ColumnQualifierFilter, which I assume is what's in play
>>>> when I fetch qualifiers. Is there a reasonable pattern to extend this and
>>>> try to use it as a scan iterator so I can do things like "greater than" a
>>>> value which will be interpreted as a Double vs. the string equality going
>>>> on now?
>>>>
>>>> Thanks,
>>>> Marc
>>>>
>>>
>>>
>>
>
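
The two questions raised in the thread (comparing the qualifier as a Double, and passing the threshold in) can be illustrated with a minimal sketch. It is deliberately plain Java with no Accumulo dependency so that it runs on its own. In Accumulo the same logic would normally sit in a subclass of org.apache.accumulo.core.iterators.Filter, with the option read in init(...) and the comparison done in accept(Key, Value). The class names and the "threshold" option name below are hypothetical and are not taken from the thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Plain-Java stand-in for a threshold filter (hypothetical class).
 * It mirrors the init/accept shape of an Accumulo Filter without
 * depending on Accumulo, so it is only an illustration of the pattern.
 */
class ScoreThresholdFilter {
    // Hypothetical option name; iterator options travel as strings.
    static final String THRESHOLD_OPT = "threshold";

    // With no option set, every numeric qualifier passes.
    private double threshold = Double.NEGATIVE_INFINITY;

    /** Mirrors init(): read the threshold from the string option map. */
    void init(Map<String, String> options) {
        String raw = options.get(THRESHOLD_OPT);
        if (raw != null) {
            threshold = Double.parseDouble(raw);
        }
    }

    /** Mirrors accept(): keep an entry only if its score is >= threshold. */
    boolean accept(String columnQualifier) {
        try {
            return Double.parseDouble(columnQualifier) >= threshold;
        } catch (NumberFormatException e) {
            // Assumption: qualifiers that are not numbers are dropped.
            return false;
        }
    }
}

class ThresholdFilterDemo {
    /** Applies the filter to a list of qualifier strings. */
    static List<String> filter(List<String> qualifiers, String threshold) {
        Map<String, String> options = new HashMap<>();
        options.put(ScoreThresholdFilter.THRESHOLD_OPT, threshold);

        ScoreThresholdFilter f = new ScoreThresholdFilter();
        f.init(options);

        List<String> kept = new ArrayList<>();
        for (String q : qualifiers) {
            if (f.accept(q)) {
                kept.add(q);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Prints [0.9, 10.0]
        System.out.println(filter(Arrays.asList("0.25", "0.9", "10.0", "n/a"), "0.5"));
    }
}
```

On the client side, the option map would typically be filled through IteratorSetting.addOption(...) and the setting attached with Scanner.addScanIterator(...). A custom iterator class still has to be loadable by the tablet servers, so this approach does not by itself avoid distributing a JAR.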
David Medinets 2013-08-22, 17:10
John Stoneham 2013-08-22, 19:09
John Vines 2013-08-22, 23:16
Marc Reichman 2013-08-22, 17:35