-
AccumuloInputFormat.setRanges
Seastrom, Jessica K 2013-01-02, 23:30
Using AccumuloInputFormat.setRanges(conf, someRange), should I expect that the Key,Values as input to the Map method will be restricted to those keys in the set contained in someRange?
My current implementation filters K,V pairs using the DistributedCache to hold the query terms (if(myDistributedCacheQueryTermsHashSet.contains(key.getRow())…) but I wonder if AccumuloInputFormat.setRanges is an alternate implementation. It didn't seem to filter as above, but perhaps I'm just not implementing it correctly.
Thank you, Jessica
-
Re: AccumuloInputFormat.setRanges
Josh Elser 2013-01-03, 00:18
Yup, the AccumuloInputFormat does just that. It also has some additional features that make it desirable to use (tablet-server local mappers, for example).
Can you describe the issues you had/how you were using it? The AccumuloInputFormat should be fairly straightforward to use.
-
Re: AccumuloInputFormat.setRanges
John Vines 2013-01-03, 22:33
As Josh said, using a series of Ranges would be more efficient. Depending on the quantity, there is a known bug in older releases when you have a LOT of ranges, but barring that it should work for you.
Instead of doing a range containing the entire table, you can do a bunch of single-row ranges which correspond to the query terms. The mappers should only ever get data which was expressed in the set of ranges supplied.
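To make the difference between the two approaches concrete, here is a small stdlib-only sketch. It is not Accumulo code: a sorted TreeMap keyed by row stands in for the table, and the class and method names (RowRangeSketch, filterFullScan, scanSingleRowRanges) are made up for illustration. In a real job the second approach would presumably amount to building one `new Range(term)` per query term and passing the collection to `AccumuloInputFormat.setRanges`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model: a TreeMap keyed by row stands in for a sorted table.
// Real Accumulo keys also carry column family, qualifier, visibility and
// timestamp; those are left out to keep the sketch short.
class RowRangeSketch {

    // Approach from the original question: read every entry and keep only
    // those whose row appears in the query-term set.
    static List<String> filterFullScan(NavigableMap<String, String> table, Set<String> terms) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : table.entrySet()) {
            if (terms.contains(e.getKey())) {
                hits.add(e.getKey() + "=" + e.getValue());
            }
        }
        return hits;
    }

    // Approach suggested in the replies: one single-row range per query term,
    // so only entries inside the supplied ranges are ever read.
    static List<String> scanSingleRowRanges(NavigableMap<String, String> table, Set<String> terms) {
        List<String> hits = new ArrayList<>();
        // Visit terms in sorted order so results come back in table order.
        for (String term : new TreeSet<>(terms)) {
            // Inclusive range [term, term], the analogue of a single-row range.
            for (Map.Entry<String, String> e : table.subMap(term, true, term, true).entrySet()) {
                hits.add(e.getKey() + "=" + e.getValue());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("apple", "1");
        table.put("banana", "2");
        table.put("cherry", "3");
        table.put("date", "4");

        // "zebra" is a query term with no matching row in the table.
        Set<String> terms = new TreeSet<>(Arrays.asList("banana", "date", "zebra"));

        System.out.println(filterFullScan(table, terms));
        System.out.println(scanSingleRowRanges(table, terms));
    }
}
```

Both methods return the same entries. The difference is that the full scan touches every row in the table, while the range-based version only touches the rows named by the query terms.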