Accumulo, mail # user - How to remove entire row at the server side?


Re: How to remove entire row at the server side?
Terry P. 2013-11-06, 20:49
Eyes of an eagle, Billie! "com" is correct, but after viewing
"org.apache.accumulo" so many times, my brain was stuck on "org" and I goofed
in my setiter syntax.

With THAT corrected, here is the new error:

root@meta> setiter -class com.esa.accumulo.iterators.ExpirationTimestampPurgeFilter -n expTsFilter -p 20 -scan -t itertest
2013-11-06 14:46:28,280 [shell.Shell] ERROR:
org.apache.accumulo.core.util.shell.ShellCommandException: Command could
not be initialized (Unable to load
com.esa.accumulo.iterators.ExpirationTimestampPurgeFilter as type
org.apache.accumulo.core.iterators.OptionDescriber; configure with 'config'
instead)

On Wed, Nov 6, 2013 at 2:43 PM, Billie Rinaldi <[EMAIL PROTECTED]> wrote:

> Is there a typo in the package name?  One place says "com" and the other
> "org".
>
>
> On Wed, Nov 6, 2013 at 12:37 PM, Terry P. <[EMAIL PROTECTED]> wrote:
>
>> Hi William, many thanks for the explanation of scan time versus
>> compaction time. I'll look through the classes again and note where the
>> remove versus suppress wordings are used and open a ticket.
>>
>> As mentioned, I only dabble in Java, but regardless of that fact at this
>> point I'm the one that has to get this done. I've cobbled together my first
>> attempt, but I get the following error when I try to add it as a scan
>> iterator for testing:
>>
>> root@meta> setiter -class org.esa.accumulo.iterators.ExpirationTimestampPurgeFilter -n expTsFilter -p 20 -scan -t itertest
>> 2013-11-06 14:06:34,914 [shell.Shell] ERROR:
>> org.apache.accumulo.core.util.shell.ShellCommandException: Command could
>> not be initialized (Servers are unable to load
>> org.esa.accumulo.iterators.ExpirationTimestampPurgeFilter as type
>> org.apache.accumulo.core.iterators.SortedKeyValueIterator)
>>
>> Here's my source.  Note that the value stored in the expTs ColFam is in
>> the format "yyyyMMddHHmmssS", which I convert to a long for a direct
>> comparison to System.currentTimeMillis(). I only overrode the init and
>> acceptRow methods, hoping the others would work as-is from the base class.
>>
>> One clarification: turns out expTs is the ColumnFamily, and the ingest
>> app does not assign a ColumnQualifier for expTs. So to amend my prior table
>> layout (including the datetime format):
>>
>>
>> Format: Key:CF:CQ:Value
>> abc:data:title:"My fantastic data"
>> abc:data:content:<bytedata>
>> abc:creTs::20130804171412445
>> abc:*expTs*::20131104171412445
>> ... 6-8 more columns of data per row ...
>>
>> where *expTs* is the ColumnFamily to determine if the entire row should
>> be removed based on whether its value is <= NOW.  If a row has not yet been
>> assigned an expiration date, expTs will not be set and the ColumnFamily
>> will not yet be present.  Seems like an odd choice to use distinct Column
>> Families, without Column Qualifiers, but that's how the ingest app was done.
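
The expiration check described above (parse the expTs value, compare it to the current time, keep rows that have no expTs yet) can be sketched in plain Java without any Accumulo classes. This is only an illustration under stated assumptions: the class name ExpTsCheck and its methods are hypothetical, the pattern "yyyyMMddHHmmssSSS" is assumed because the sample values have 17 digits, and UTC is assumed because the thread does not say which time zone the ingest app writes.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

/** Hypothetical helper; not part of Accumulo or of the filter in this thread. */
final class ExpTsCheck {

    // Assumption: 17-digit values such as 20131104171412445 (3 millisecond digits).
    private static final String PATTERN = "yyyyMMddHHmmssSSS";

    /** Converts an expTs string to epoch milliseconds, assuming it is stored in UTC. */
    static long toEpochMillis(String expTs) {
        // SimpleDateFormat is not thread-safe, so a new instance is built per call.
        SimpleDateFormat fmt = new SimpleDateFormat(PATTERN);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: UTC
        fmt.setLenient(false);                         // reject values such as month 13
        try {
            return fmt.parse(expTs).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Bad expTs value: " + expTs, e);
        }
    }

    /** Returns true when the row should be dropped: expTs is present and <= now. */
    static boolean isExpired(String expTs, long nowMillis) {
        if (expTs == null || expTs.isEmpty()) {
            return false; // no expiration assigned yet, so keep the row
        }
        return toEpochMillis(expTs) <= nowMillis;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        System.out.println(isExpired("20131104171412445", now));
        System.out.println(isExpired(null, now));
    }
}
```

In a RowFilter subclass, the same comparison would sit inside acceptRow, returning the negation of isExpired for the row's expTs value, since acceptRow returns true for rows that should be kept.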
>>
>> I greatly appreciate any advice you can provide.
>>
>> package com.esa.accumulo.iterators;
>>
>> import java.io.IOException;
>> import java.text.ParseException;
>> import java.text.SimpleDateFormat;
>> import java.util.Date;
>> import java.util.Map;
>>
>> import org.apache.accumulo.core.data.Key;
>> import org.apache.accumulo.core.data.Value;
>> import org.apache.accumulo.core.iterators.IteratorEnvironment;
>> import org.apache.accumulo.core.iterators.SortedKeyValueIterator;
>> import org.apache.accumulo.core.iterators.user.RowFilter;
>>
>> /**
>>  * A filter that removes rows based on the column designated as the
>>  * "expiration timestamp" column family.
>>  *
>>  * It removes the row if the value in the expirationTimestamp column is
>>  * less than currentTime.
>>  *
>>  * TODO: The designation of the expirationTimestamp ColumnFamily and its
>>  * DateFormat is set in the iterator options when the iterator is applied
>>  * to the table. (For now it is hardcoded to match the format used in the
>>  * Solr-Accumulo plugin)
>>  */
>> public class ExpirationTimestampPurgeFilter extends RowFilter {