Accumulo >> mail # dev >> Re: Could combiners be coded using groovy?


Re: Could combiners be coded using groovy?
The base semantics of Accumulo are actually more those of a multimap, and the
VersioningIterator is what turns a table into a map. You could try using a
custom comparator when you construct your TreeMap to effectively turn it
into a multimap. Something like this ought to do the trick:

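// warning -- totally untested code -- might not even compile
import java.util.Comparator;
import org.apache.accumulo.core.data.Key;

// Never reports two keys as equal, so TreeMap.put() keeps every entry
// instead of overwriting. This deliberately breaks the Comparator contract:
// lookups such as get() will not find keys, so use it only for iteration.
class NeverEqual implements Comparator<Key> {
  public int compare(Key a, Key b) {
    int result = a.compareTo(b);
    if (result == 0)
      return 1; // treat an equal key as greater so it sorts after existing entries
    return result;
  }
}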
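
To make the effect concrete, here is a minimal sketch that needs no Accumulo on the classpath. It assumes plain String keys as a stand-in for Key, and the class and method names (NeverEqualDemo, valuesInOrder) are made up for this illustration. It puts the same key three times into a default TreeMap and into one built with a never-equal comparator:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

public class NeverEqualDemo {

  // Same idea as NeverEqual above, but generic so it works without Accumulo's Key.
  static class NeverEqual<K extends Comparable<K>> implements Comparator<K> {
    @Override
    public int compare(K a, K b) {
      int result = a.compareTo(b);
      // Never return 0: an equal key is treated as greater than the existing one.
      return result == 0 ? 1 : result;
    }
  }

  // Puts three values under one identical key and returns what the map retained.
  static List<String> valuesInOrder(TreeMap<String, String> tm) {
    for (String v : new String[] {"13", "14", "15"}) {
      tm.put("row cf:cq", v);
    }
    return new ArrayList<>(tm.values());
  }

  public static void main(String[] args) {
    // Default ordering: equal keys overwrite each other, only the last value survives.
    System.out.println(valuesInOrder(new TreeMap<String, String>()));
    // prints [15]

    // Never-equal ordering: every put adds a new entry, in insertion order.
    System.out.println(valuesInOrder(new TreeMap<String, String>(new NeverEqual<String>())));
    // prints [13, 14, 15]
  }
}
```

With the second map, a source built on top of it would hand all three entries to a combiner. The trade-off is that get() and containsKey() on such a map never match, so it is only suitable as test input that gets iterated.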

Cheers,
Adam
On Sat, May 19, 2012 at 5:57 PM, David Medinets <[EMAIL PROTECTED]> wrote:

> I finally got a chance to try your suggestion. But I'm confused
> because the semantics of a TreeMap seem different from those of
> Accumulo. For example, here I insert some data into the TreeMap:
>
>     TreeMap<Key, Value> tm = new TreeMap<Key, Value>();
>     Key key = new Key(new Text("row"), new Text("cf"), new Text("cq"),
>         new Text(""));
>     Value value = new Value("13".getBytes());
>     tm.put(key, value);
>
>     key = new Key(new Text("row"), new Text("cf"), new Text("cq"),
>         new Text(""));
>     value = new Value("14".getBytes());
>     tm.put(key, value);
>
>     key = new Key(new Text("row"), new Text("cf"), new Text("cq"),
>         new Text(""));
>     value = new Value("15".getBytes());
>     tm.put(key, value);
>
> And then I try to use a SummingCombiner which I have used successfully
> against Accumulo. Here is that code:
>
>     Map<String,String> options = new HashMap<String, String>();
>     options.put("type", "STRING");
>
>     SummingCombiner iter = new SummingCombiner();
>
>     IteratorSetting is = new IteratorSetting(1, SummingCombiner.class,
>         options);
>     Combiner.setCombineAllColumns(is, true);
>
>     iter.init(new SortedMapIterator(tm), is.getOptions(), null);
>     iter.seek(new Range(), new ArrayList<ByteSequence>(), false);
>
>     while (iter.hasTop()) {
>         Key k = iter.getTopKey();
>         Value v = iter.getTopValue();
>         System.out.println("K: " + k + "  V: " + v);
>         iter.next();
>     }
>     System.out.println("END");
>
> Here is the output:
>
> START
> K: row cf:cq [] 9223372036854775807 false  V: 15
> END
>
> The SummingCombiner is only seeing one record, which makes sense since
> the keys overwrite each other in the TreeMap. Am I missing something?
>
> On Tue, Apr 10, 2012 at 3:57 PM, Billie J Rinaldi
> <[EMAIL PROTECTED]> wrote:
> > I'm not familiar with Groovy, but it sounds interesting.  I could
> > recommend some ways to test your iterator before you push it out to
> > Accumulo.  You can make some fake data for a unit test by creating a
> > TreeMap<Key,Value> and then using a SortedMapIterator to turn that into a
> > source for your iterator.  A lot of our unit tests look like the following.
> >
> >  TreeMap<Key,Value> tm = new TreeMap<Key,Value>();
> >  // put some data into the tree map
> >
> >  MyIterator iter = new MyIterator();
> >
> >  IteratorSetting is = new IteratorSetting(1, MyIterator.class);
> >  MyIterator.setSomeOption(is, option);
> >
> >  iter.init(new SortedMapIterator(tm), is.getOptions(), null);
> >  iter.seek(new Range(), new ArrayList<ByteSequence>(), false);
> >
> >  while (iter.hasTop()) {
> >    Key k = iter.getTopKey();
> >    Value v = iter.getTopValue();
> >    // check that k and v are what you expected
> >    iter.next();
> >  }
> >
> > Another option is to use the ClientSideIteratorScanner to test your
> > iterator in your local JVM before running it on a tserver.
> >
> > Billie
> >
> >
> > On Sunday, April 8, 2012 11:08:05 PM, "David Medinets" <[EMAIL PROTECTED]> wrote:
> >> I was working with combiners and seeing the jar file loaded and