-
Inconsistent Naming in IteratorSetting class.
David Medinets 2012-03-18, 23:25
A Property object is used to hold key-value information that modifies the behavior of an Iterator. However, these are the methods available:
getProperties
setProperties
hasProperties
addOption
removeOption
addOptions
clearOptions
Is there a reason why the same concept has two names? I'd like to settle on one name and standardise.
Could we change the names to be something like getIteratorSettingProperties? I know that some people are annoyed by longer method names, but when searching through a code base, having unique names is handy. Searching for a generically named method, such as getProperties, returns a lot of false positives.
Is there a list (or example) of supported properties? For example, I see the following options when I type 'help setiter'. How are these options defined in code?
-ageoff,--ageoff                 an aging off type
-agg,--aggregator                an aggregating type
-majc,--major-compaction         applied at major compaction
-minc,--minor-compaction         applied at minor compaction
-regex,--regular-expression      a regex matching type
-reqvis,--require-visibility     a type that omits entries with empty visibilities
-scan,--scan-time                applied at scan time
-vers,--version                  a versioning type
-
Re: Inconsistent Naming in IteratorSetting class.
Josh Elser 2012-03-19, 01:22
On 03/18/2012 06:25 PM, David Medinets wrote:
> Is there a list (or example) of supported properties? For example, I
> see the following options when I type 'help setiter'. How are these
> options defined in code?
Take a look at core/org/apache/accumulo/core/util/shell/commands/SetIterCommand.java. The options are defined for each command.
> -ageoff,--ageoff                 an aging off type
> -agg,--aggregator                an aggregating type
> -majc,--major-compaction         applied at major compaction
> -minc,--minor-compaction         applied at minor compaction
> -regex,--regular-expression      a regex matching type
> -reqvis,--require-visibility     a type that omits entries with empty visibilities
> -scan,--scan-time                applied at scan time
> -vers,--version                  a versioning type
-
Re: Inconsistent Naming in IteratorSetting class.
David Medinets 2012-03-19, 01:55
In the getOptions (another generically-named method!) in SetIterCommand, I see this code:
    aggTypeOpt = new Option("agg", "aggregator", false, "an aggregating type");
    regexTypeOpt = new Option("regex", "regular-expression", false, "a regex matching type");
    versionTypeOpt = new Option("vers", "version", false, "a versioning type");
    reqvisTypeOpt = new Option("reqvis", "require-visibility", false, "a type that omits entries with empty visibilities");
    ageoffTypeOpt = new Option("ageoff", "ageoff", false, "an aging off type");
It is not clear to me that the command-line option names (like 'agg') are the same values used in the IteratorSetting class. The IteratorSetting seems to hold a generic map (which makes sense, to provide flexibility).
Let me elaborate via code:
    IteratorSetting iteratorSetting = new IteratorSetting(1, AgeCombiner.class);
    iteratorSetting.setName("ageCombiner");
    Combiner.setColumns(iteratorSetting, Collections.singletonList(new IteratorSetting.Column("age")));
    connector.tableOperations().attachIterator(tableName, iteratorSetting);
Leaving aside the need to call a static method to set the column list, how do I set the iterator type?
I want to create an example for each kind of iterator - in code (i.e., not through the command line).
-
Re: Inconsistent Naming in IteratorSetting class.
Josh Elser 2012-03-19, 03:38
Sorry, my bad. Sent you in the wrong direction.
To answer the questions I caused you to ask: getOptions() in the SetIterCommand class is defined in the abstract org.apache.accumulo.util.shell.Command. It's intended that the concrete class override the getOptions() method to list the actual options for that Command class.
As for the ambiguity in method names, in IteratorSetting, setProperties ends up calling addOptions (which calls addOption). Not sure if there is any historical significance in the multiple method names doing the same thing. Someone else would have to confirm/deny, but I don't see any reason to have both versions.
Moving on, I guess I'm confused by "each kind of Iterator". Are you referring to the SortedKeyValueIterator interface as opposed to the (deprecated) Aggregator interface and/or Combiner class? Are you just referring to the "time" (minc, majc, scan) the class would be instantiated/run?
In the example you're making, the Column option will be set on the table (from your tableName variable). Then, when the AgeCombiner is instantiated (for whatever time you configured it for: again, majc, minc, or scan time), the options will be passed into the init method of your AgeCombiner via the Map<String, String> argument. Take a look at the init() method in the abstract Combiner class. You'll see it has references to the key you used to set the "age" column to be combined.
To be super clear, from the Wikipedia example: Set on my "wikiIndex" table:
table | table.iterator.majc.UIDAggregator .......... | 19,org.apache.accumulo.examples.wikisearch.iterator.GlobalIndexUidCombiner
table | table.iterator.majc.UIDAggregator.opt.all .. | true
table | table.iterator.minc.UIDAggregator .......... | 19,org.apache.accumulo.examples.wikisearch.iterator.GlobalIndexUidCombiner
table | table.iterator.minc.UIDAggregator.opt.all .. | true
table | table.iterator.scan.UIDAggregator .......... | 19,org.apache.accumulo.examples.wikisearch.iterator.GlobalIndexUidCombiner
table | table.iterator.scan.UIDAggregator.opt.all .. | true
The UIDAggregator will be run at all three "times" and applied over all columns. To directly answer your final question, there is no "list" of all possible properties for Iterators/Combiners, since it depends entirely on the Iterator/Combiner that was set. Perhaps you could make the documentation on Combiners (docs/combiners.html) more explicit about the properties defined there?
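To make the key layout in that listing concrete, here is a minimal sketch in plain Java of how an iterator name, priority, class, scopes, and options could be flattened into table properties. IteratorPropertySketch and toTableProperties are hypothetical names used only for illustration; they are not part of Accumulo, and the layout is inferred solely from the listing above.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not an Accumulo class. It only mirrors the key layout
// visible in the "wikiIndex" table listing quoted in this thread.
final class IteratorPropertySketch {

    // The three "times" (scopes) mentioned in the thread.
    enum Scope { majc, minc, scan }

    // For each scope, emits "table.iterator.<scope>.<name>" = "<priority>,<class>"
    // and one "table.iterator.<scope>.<name>.opt.<key>" entry per option.
    static Map<String, String> toTableProperties(String name, int priority, String className,
            Set<Scope> scopes, Map<String, String> options) {
        Map<String, String> props = new LinkedHashMap<>();
        for (Scope scope : scopes) {
            String prefix = "table.iterator." + scope.name() + "." + name;
            props.put(prefix, priority + "," + className);
            for (Map.Entry<String, String> opt : options.entrySet()) {
                props.put(prefix + ".opt." + opt.getKey(), opt.getValue());
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("all", "true");
        Map<String, String> props = toTableProperties("UIDAggregator", 19,
                "org.apache.accumulo.examples.wikisearch.iterator.GlobalIndexUidCombiner",
                EnumSet.allOf(Scope.class), options);
        // Print one "key = value" line per generated property.
        for (Map.Entry<String, String> e : props.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

The point of the sketch is that the ".opt." entries are just strings stored with the table; what they mean is decided entirely by the iterator class that reads them in init().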
Also, let me know if something wasn't clear in that explanation :D
- Josh
-
Re: Inconsistent Naming in IteratorSetting class.
Eric Newton 2012-03-19, 14:18
I don't know why there are duplicate methods for the same concept.
I propose we add getOptions, and deprecate getProperties, setProperties, hasProperties.
And getOptions should return an unmodifiable map.
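A rough sketch of what that proposal could look like, using a hypothetical stand-in class rather than the real IteratorSetting (only the option map is modelled). The choice to have setProperties replace the existing options before delegating to addOptions is an assumption made for the sketch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for IteratorSetting; it models only the option map.
final class OptionHolderSketch {
    private final Map<String, String> options = new HashMap<>();

    void addOption(String key, String value) {
        if (key == null || value == null) {
            throw new IllegalArgumentException("option key and value must be non-null");
        }
        options.put(key, value);
    }

    void addOptions(Map<String, String> more) {
        for (Map.Entry<String, String> e : more.entrySet()) {
            addOption(e.getKey(), e.getValue());
        }
    }

    // Proposed accessor: a read-only view, so callers must go through addOption.
    Map<String, String> getOptions() {
        return Collections.unmodifiableMap(options);
    }

    // Old name kept for compatibility, delegating to the new one.
    @Deprecated
    Map<String, String> getProperties() {
        return getOptions();
    }

    // Old name kept for compatibility. Assumption: it replaces existing options.
    @Deprecated
    void setProperties(Map<String, String> properties) {
        Map<String, String> copy = new HashMap<>(properties); // copy first in case a view of our own map is passed
        options.clear();
        addOptions(copy);
    }

    public static void main(String[] args) {
        OptionHolderSketch setting = new OptionHolderSketch();
        setting.addOption("all", "true");
        try {
            setting.getOptions().put("all", "false");
        } catch (UnsupportedOperationException e) {
            System.out.println("getOptions() is read-only");
        }
        System.out.println(setting.getOptions());
    }
}
```

Returning an unmodifiable view means validation in addOption cannot be bypassed by writing to the returned map directly.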
I disagree about the generic names; I like short names. Eclipse finds references pretty well.
-Eric
-
Re: Inconsistent Naming in IteratorSetting class.
Billie J Rinaldi 2012-03-19, 15:38
On Sunday, March 18, 2012 7:25:49 PM, "David Medinets" <[EMAIL PROTECTED]> wrote:
> Is there a list (or example) of supported properties? For example, I
> see the following options when I type 'help setiter'. How are these
> options defined in code?
>
> -ageoff,--ageoff                 an aging off type
> -agg,--aggregator                an aggregating type
> -majc,--major-compaction         applied at major compaction
> -minc,--minor-compaction         applied at minor compaction
> -regex,--regular-expression      a regex matching type
> -reqvis,--require-visibility     a type that omits entries with empty visibilities
> -scan,--scan-time                applied at scan time
> -vers,--version                  a versioning type
majc, minc, and scan are iterator scopes. Any iterator can be configured with any scope, although many iterators only make sense for the scan scope. The remaining options, ageoff, agg, regex, reqvis, and vers, are shorthand for specifying an iterator class name. They correspond to the AgeOffFilter, AggregatingIterator, RegExFilter, ReqVisFilter, and VersioningIterator. If you don't use one of these shorthand options, you must specify the iterator class with -class classname.
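As an illustration of that split, here is a small sketch that treats the shorthand flags as a lookup from flag to iterator class, falling back to an explicit class name. SetIterShorthandSketch and resolveClass are hypothetical names, this is not how SetIterCommand is actually written, and only simple class names are used because the package names are not given here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the SetIterCommand implementation.
final class SetIterShorthandSketch {

    // Scope flags: they say when the iterator runs, not which iterator it is.
    static final Set<String> SCOPES =
            Collections.unmodifiableSet(new LinkedHashSet<>(Arrays.asList("majc", "minc", "scan")));

    // Shorthand type flags and the iterator each one stands for (simple names only).
    static final Map<String, String> SHORTHAND;
    static {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("ageoff", "AgeOffFilter");
        m.put("agg", "AggregatingIterator");
        m.put("regex", "RegExFilter");
        m.put("reqvis", "ReqVisFilter");
        m.put("vers", "VersioningIterator");
        SHORTHAND = Collections.unmodifiableMap(m);
    }

    // Returns the iterator class for a shorthand flag, or the explicit class
    // name (the value given to -class) when no shorthand flag was used.
    static String resolveClass(String shorthandFlag, String explicitClass) {
        if (shorthandFlag != null) {
            String resolved = SHORTHAND.get(shorthandFlag);
            if (resolved == null) {
                throw new IllegalArgumentException("unknown shorthand: " + shorthandFlag);
            }
            return resolved;
        }
        if (explicitClass == null || explicitClass.isEmpty()) {
            throw new IllegalArgumentException("no shorthand given, so -class is required");
        }
        return explicitClass;
    }

    public static void main(String[] args) {
        System.out.println(resolveClass("vers", null));
        System.out.println(resolveClass(null, "com.example.MyIterator")); // made-up class name
    }
}
```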
So, you have guessed correctly that these are not the same options that are passed in to an Iterator via an IteratorSetting. Each Iterator can define the options it needs. Our current recommended practice is to use a static method to set these options in an IteratorSetting. We have a mechanism for the Iterators to communicate what options they need to SetIterCommand so that the shell can interactively prompt the user for them. This is the org.apache.accumulo.core.iterators.OptionDescriber interface. Iterators that implement this interface can be configured with the setiter command because there's a way to find out the options they require. Iterators that do not implement it must be configured manually using the config command.
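Here is a minimal sketch of that pattern in plain Java: a static setter hides the option key, init() reads the value back out of the Map<String, String>, and a describer-style interface lets a tool discover which options are needed. TtlFilterSketch and DescribesOptionsSketch are hypothetical and deliberately simplified; the interface is not the real OptionDescriber signature, and the "ttl" option is invented for the example.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical iterator-like class; it does not extend any Accumulo type.
final class TtlFilterSketch implements DescribesOptionsSketch {
    private static final String TTL_OPT = "ttl"; // invented option key
    private long ttlMillis;

    // Static setter: callers never need to know the key string.
    static void setTtl(Map<String, String> options, long ttlMillis) {
        options.put(TTL_OPT, Long.toString(ttlMillis));
    }

    // Mirrors the idea of init(): options arrive as plain strings.
    void init(Map<String, String> options) {
        if (!validateOptions(options)) {
            throw new IllegalArgumentException("missing or invalid option: " + TTL_OPT);
        }
        ttlMillis = Long.parseLong(options.get(TTL_OPT));
    }

    long getTtlMillis() {
        return ttlMillis;
    }

    @Override
    public Map<String, String> describeOptions() {
        Map<String, String> described = new LinkedHashMap<>();
        described.put(TTL_OPT, "time to live, in milliseconds");
        return described;
    }

    @Override
    public boolean validateOptions(Map<String, String> options) {
        String raw = options.get(TTL_OPT);
        if (raw == null) {
            return false;
        }
        try {
            return Long.parseLong(raw) >= 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> options = new HashMap<>();
        TtlFilterSketch.setTtl(options, 60000L);
        TtlFilterSketch filter = new TtlFilterSketch();
        filter.init(options);
        System.out.println(filter.describeOptions().keySet() + " -> " + filter.getTtlMillis());
    }
}

// Simplified stand-in for the idea behind OptionDescriber (not its real signature).
interface DescribesOptionsSketch {
    Map<String, String> describeOptions(); // option name -> human-readable description
    boolean validateOptions(Map<String, String> options);
}
```

A shell-like tool could call describeOptions() to prompt for each option and validateOptions() to check the answers before anything is stored.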
Billie
-
Re: Inconsistent Naming in IteratorSetting class.
David Medinets 2012-03-19, 23:21
On Mon, Mar 19, 2012 at 10:18 AM, Eric Newton <[EMAIL PROTECTED]> wrote:
> I disagree about the generic names; I like short names. Eclipse finds
> references pretty well.
How are you getting Eclipse to compile the code? Can you send me your .project and .classpath files? I'd love to have Eclipse working with the Accumulo code.
-
Re: Inconsistent Naming in IteratorSetting class.
Josh Elser 2012-03-20, 00:20
On 3/19/2012 7:21 PM, David Medinets wrote:
> How are you getting Eclipse to compile the code? Can you send me your
> .project and .classpath files? I'd love to have Eclipse working with
> the Accumulo code.
Have you looked at m2eclipse? You should be able to just import the top-level Accumulo project directly into Eclipse.
-
Re: Inconsistent Naming in IteratorSetting class.
Billie J Rinaldi 2012-03-20, 00:34
On Monday, March 19, 2012 7:21:07 PM, "David Medinets" <[EMAIL PROTECTED]> wrote:
> On Mon, Mar 19, 2012 at 10:18 AM, Eric Newton <[EMAIL PROTECTED]> wrote:
> > I disagree about the generic names; I like short names. Eclipse finds
> > references pretty well.
>
> How are you getting Eclipse to compile the code? Can you send me your
> .project and .classpath files? I'd love to have Eclipse working with
> the Accumulo code.
Check the code out as a Maven project. You'll have to install m2e, and then a Maven SCM Handler for your SVN client. I prefer Subclipse; with it, you can right-click on a project when you're in the SVN exploring perspective, then select Check out as Maven Project. The other option is Subversive, and you'll select File > New > Other > Maven > Checkout Maven Projects from SCM.
Billie