Accumulo >> mail # user >> Filtering on column qualifier


Re: Filtering on column qualifier
I apologize for my denseness, but could you walk me through this? Is there
an existing scan iterator that interprets Groovy, or is this
something I would build?
On Thu, Aug 22, 2013 at 12:10 PM, David Medinets
<[EMAIL PROTECTED]> wrote:

> The advantage is that you'd write the iterator only once and deploy it to
> the cluster. After that, the Groovy snippet alone changes its behavior. You'd
> avoid passing unneeded data to your client code, but the Accumulo cluster
> would do more of the work.
>
>
> On Thu, Aug 22, 2013 at 12:33 PM, Marc Reichman <
> [EMAIL PROTECTED]> wrote:
>
>> I haven't considered that. Would that allow me to specify it in the
>> client-side code and not worry about spreading JARs around? It is a very
>> basic need; my scan iterator loop right now contains:
>>
>>             String matchScoreString = key.getColumnQualifier().toString();
>>             Double score = Double.parseDouble(matchScoreString);
>>
>>             if (threshold != null && threshold > score) {
>>                 // TODO: figure out if this is possible to do via data-local scan iterator
>>                 continue;
>>             }
>>
>> What is the pattern for including a groovy snippet for a scan iterator?
>>
>>
>> On Thu, Aug 22, 2013 at 11:16 AM, David Medinets <
>> [EMAIL PROTECTED]> wrote:
>>
>>> Have you thought of writing a filter class that takes a bit of Groovy
>>> for execution inside the accept method? Whether that fits depends on how
>>> efficient you need to be and how changeable your constraints are.
>>>
>>>
>>> On Thu, Aug 22, 2013 at 10:19 AM, Marc Reichman <
>>> [EMAIL PROTECTED]> wrote:
>>>
>>>> Extending looked like a bit of a boondoggle, because all of the useful
>>>> fields in the class are private, not protected. I also ran into another
>>>> architectural question: how does one pass a value (à la constructor) into
>>>> one of these classes? If I'm going to use this to filter based on a
>>>> threshold, I'd need to pass that threshold in somehow.
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Aug 21, 2013 at 9:49 AM, John Vines <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> There's no way to extend the ColumnQualifierFilter via configuration,
>>>>> but it sounds like you are on top of it. You just need to extend the class,
>>>>> possibly copy a bit of code, and change the equality check to a compareTo
>>>>> after converting the Strings to Doubles.
>>>>>
>>>>>
>>>>> On Wed, Aug 21, 2013 at 10:00 AM, Marc Reichman <
>>>>> [EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> I have some data stored in Accumulo with some scores stored as column
>>>>>> qualifiers (there was an older thread about this). I would like to find a
>>>>>> way to do thresholding when retrieving the data without retrieving it all
>>>>>> and then manually filtering out items below my threshold.
>>>>>>
>>>>>> I know I can "fetch" column qualifiers which are exact.
>>>>>>
>>>>>> I've seen the ColumnQualifierFilter, which I assume is what's in play
>>>>>> when I fetch qualifiers. Is there a reasonable pattern to extend this and
>>>>>> try to use it as a scan iterator so I can do things like "greater than" a
>>>>>> value which will be interpreted as a Double vs. the string equality going
>>>>>> on now?
>>>>>>
>>>>>> Thanks,
>>>>>> Marc
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
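
A minimal sketch of the threshold filter discussed in this thread, written in plain Java so that it runs without Accumulo on the classpath. It assumes the threshold reaches the iterator as a string option: in Accumulo, iterator options arrive as a Map<String, String> in init(), and a client can set them with IteratorSetting.addOption before calling Scanner.addScanIterator. The class name ThresholdFilterSketch, the option name "threshold", and the simplified init/accept signatures are hypothetical stand-ins for a real subclass of org.apache.accumulo.core.iterators.Filter, whose accept method takes a Key and a Value.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a Filter subclass; not part of Accumulo.
class ThresholdFilterSketch {

    // Assumed option name; any key agreed on by client and iterator would do.
    public static final String THRESHOLD_OPT = "threshold";

    private double threshold;

    // Stands in for init(source, options, env): options are plain strings,
    // which is how a value can be passed in without a constructor argument.
    public void init(Map<String, String> options) {
        String raw = options.get(THRESHOLD_OPT);
        if (raw == null) {
            throw new IllegalArgumentException("missing option: " + THRESHOLD_OPT);
        }
        threshold = Double.parseDouble(raw);
    }

    // Stands in for accept(Key, Value); the argument plays the role of
    // key.getColumnQualifier().toString() from the client loop in the thread.
    public boolean accept(String columnQualifier) {
        try {
            // Same rule as the client loop: skip entries where threshold > score.
            return Double.parseDouble(columnQualifier) >= threshold;
        } catch (NumberFormatException e) {
            // Assumption: qualifiers that are not numeric are filtered out.
            return false;
        }
    }

    // Simulates a scan over qualifiers with the filter configured via options.
    public static List<String> scan(List<String> qualifiers, String threshold) {
        Map<String, String> options = new HashMap<>();
        options.put(THRESHOLD_OPT, threshold);
        ThresholdFilterSketch filter = new ThresholdFilterSketch();
        filter.init(options);

        List<String> kept = new ArrayList<>();
        for (String q : qualifiers) {
            if (filter.accept(q)) {
                kept.add(q);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> qualifiers = List.of("0.91", "0.42", "0.50", "n/a");
        System.out.println(scan(qualifiers, "0.5"));
    }
}
```

With a real Filter subclass, the class itself would still have to be available on the tablet servers' classpath; only the threshold value travels with each scan.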