Accumulo >> mail # dev >> Re: Could combiners be coded using groovy?


Re: Could combiners be coded using groovy?
The base semantics of Accumulo are actually more of a multimap, and the
VersioningIterator is what turns a table into a map. You could try using a
custom comparator when you construct your TreeMap to effectively turn it
into a multimap. Something like this ought to do the trick:

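// warning -- totally untested code -- might not even compile
import java.util.Comparator;
import org.apache.accumulo.core.data.Key;

class NeverEqual implements Comparator<Key> {
  public int compare(Key a, Key b) {
    int result = a.compareTo(b);
    // never report equality, so put() adds a new entry instead of
    // replacing the existing one
    if (result == 0)
      return 1;
    return result;
  }
}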
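
To illustrate the effect without needing Accumulo on the classpath, here is a
minimal sketch of the same trick using plain String keys and Long values as
stand-ins for Key and Value. The names NeverEqualSketch and MultimapDemo are
made up for this example, and the summing loop only imitates what a
SummingCombiner would see; it is not the Accumulo API.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Generic stand-in for the NeverEqual comparator above. String keys replace
// Accumulo's Key so the sketch runs with the JDK alone (an assumption made
// for illustration only).
class NeverEqualSketch<T extends Comparable<T>> implements Comparator<T> {
  @Override
  public int compare(T a, T b) {
    int result = a.compareTo(b);
    // Deliberately breaks the Comparator contract: an equal key sorts as
    // "greater", so put() appends a new entry instead of replacing one.
    return result == 0 ? 1 : result;
  }
}

class MultimapDemo {
  // Insert three values under the same key, as in the quoted test below.
  static TreeMap<String, Long> fill(TreeMap<String, Long> tm) {
    tm.put("row cf:cq", 13L);
    tm.put("row cf:cq", 14L);
    tm.put("row cf:cq", 15L);
    return tm;
  }

  // Sum every value in iteration order, roughly what a summing combiner
  // would do with the entries it is handed.
  static long sum(Map<String, Long> m) {
    long total = 0;
    for (long v : m.values()) {
      total += v;
    }
    return total;
  }

  public static void main(String[] args) {
    TreeMap<String, Long> plain = fill(new TreeMap<String, Long>());
    TreeMap<String, Long> multi =
        fill(new TreeMap<String, Long>(new NeverEqualSketch<String>()));
    System.out.println("plain: " + plain.size() + " entry, sum " + sum(plain));
    System.out.println("multi: " + multi.size() + " entries, sum " + sum(multi));
  }
}
```

With the default ordering the three puts collapse into one entry (sum 15);
with the never-equal comparator all three survive (sum 42). One caveat: since
the comparator never returns 0, get() and containsKey() will not find
anything in such a map, so it is only suitable for iterating over, e.g. as
test input.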

Cheers,
Adam
On Sat, May 19, 2012 at 5:57 PM, David Medinets <[EMAIL PROTECTED]> wrote:

> I finally got a chance to try your suggestion. But I'm confused
> because the semantics of a TreeMap seem different from those of
> Accumulo. For example, here I insert some data into the TreeMap:
>
>                TreeMap<Key, Value> tm = new TreeMap<Key, Value>();
>                Key key = new Key(new Text("row"), new Text("cf"),
>                                new Text("cq"), new Text(""));
>                Value value = new Value("13".getBytes());
>                tm.put(key, value);
>
>                key = new Key(new Text("row"), new Text("cf"),
>                                new Text("cq"), new Text(""));
>                value = new Value("14".getBytes());
>                tm.put(key, value);
>
>                key = new Key(new Text("row"), new Text("cf"),
>                                new Text("cq"), new Text(""));
>                value = new Value("15".getBytes());
>                tm.put(key, value);
>
> And then I try to use a SummingCombiner which I have used successfully
> against Accumulo. Here is that code:
>
>                Map<String,String> options = new HashMap<String, String>();
>                options.put("type", "STRING");
>
>                SummingCombiner iter = new SummingCombiner();
>
>                IteratorSetting is = new IteratorSetting(1,
>                                SummingCombiner.class, options);
>                Combiner.setCombineAllColumns(is, true);
>
>                iter.init(new SortedMapIterator(tm), is.getOptions(), null);
>                iter.seek(new Range(), new ArrayList<ByteSequence>(), false);
>
>                while (iter.hasTop()) {
>                        Key k = iter.getTopKey();
>                        Value v = iter.getTopValue();
>                        System.out.println("K: " + k + "  V: " + v);
>                        iter.next();
>                }
>                System.out.println("END");
>
> Here is the output:
>
> START
> K: row cf:cq [] 9223372036854775807 false  V: 15
> END
>
> The SummingCombiner is only seeing one record which makes sense since
> the keys overwrite each other in the TreeMap. Am I missing something?
>
> On Tue, Apr 10, 2012 at 3:57 PM, Billie J Rinaldi
> <[EMAIL PROTECTED]> wrote:
> > I'm not familiar with Groovy, but it sounds interesting.  I could
> > recommend some ways to test your iterator before you push it out to
> > Accumulo.  You can make some fake data for a unit test by creating a
> > TreeMap<Key,Value> and then using a SortedMapIterator to turn that into a
> > source for your iterator.  A lot of our unit tests look like the following.
> >
> >  TreeMap<Key,Value> tm = new TreeMap<Key,Value>();
> >  // put some data into the tree map
> >
> >  MyIterator iter = new MyIterator();
> >
> >  IteratorSetting is = new IteratorSetting(1, MyIterator.class);
> >  MyIterator.setSomeOption(is, option);
> >
> >  iter.init(new SortedMapIterator(tm), is.getOptions(), null);
> >  iter.seek(new Range(), new ArrayList<ByteSequence>(), false);
> >
> >  while (iter.hasTop()) {
> >    Key k = iter.getTopKey();
> >    Value v = iter.getTopValue();
> >    // check that k and v are what you expected
> >    iter.next();
> >  }
> >
> > Another option is to use the ClientSideIteratorScanner to test your
> > iterator in your local JVM before running it on a tserver.
> >
> > Billie
> >
> >
> > On Sunday, April 8, 2012 11:08:05 PM, "David Medinets"
> > <[EMAIL PROTECTED]> wrote:
> >> I was working with combiners and seeing the jar file loaded and