Accumulo, mail # dev - feedback on Typo

Keith Turner 2012-08-11, 00:07
Ed Kohlwey 2012-08-13, 00:11
Keith Turner 2012-08-13, 15:02
Josh Elser 2012-08-13, 01:36
Keith Turner 2012-08-13, 16:06
Re: feedback on Typo
Josh Elser 2012-08-13, 22:03
Even with something as simple as a pair, things can start getting
difficult. I suppose it really revolves around the level of support you
want to provide at scan time, e.g. "find all pairs where the second is

Spending a few minutes thinking about it, an index could be a separate
table but wouldn't necessarily have to be. It depends on the complexity
of the structure you're trying to index. Using the Pair example again,
you could reserve a column (family) to place index records in which
simply inverts the Pair in the colqual.
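A minimal sketch of that inverted-colqual idea, in plain Java with no Accumulo dependency. Everything named here is an assumption for illustration: the family names "data" and "idx", the NUL separator, and the sorted set that stands in for the sorted keys of a single row. It also assumes neither half of the pair contains the separator character.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class PairIndexSketch {
    // Hypothetical names: neither family comes from Typo or Accumulo.
    static final String DATA_FAMILY = "data";
    static final String INDEX_FAMILY = "idx";
    // Separator between key parts; assumes the pair values never contain it.
    static final char SEP = '\u0000';

    // Sorted "family SEP qualifier" strings stand in for the sorted keys of one row.
    private final TreeSet<String> keys = new TreeSet<>();

    /** Writes the pair as (first, second) plus an index record with the pair inverted. */
    public void put(String first, String second) {
        keys.add(DATA_FAMILY + SEP + first + SEP + second);
        keys.add(INDEX_FAMILY + SEP + second + SEP + first);
    }

    /** Finds all firsts whose second equals the argument via a prefix scan of the index family. */
    public List<String> firstsWhereSecondIs(String second) {
        String prefix = INDEX_FAMILY + SEP + second + SEP;
        List<String> firsts = new ArrayList<>();
        for (String key : keys.tailSet(prefix, true)) {
            if (!key.startsWith(prefix)) {
                break; // past the end of the matching range
            }
            firsts.add(key.substring(prefix.length()));
        }
        return firsts;
    }
}
```

Because the index records live in their own family and lead with the second element, the lookup becomes a contiguous range scan instead of a filter over every pair. The cost is one extra record per pair and the need to keep both records in step on updates and deletes.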

On 08/13/2012 11:06 AM, Keith Turner wrote:
> On Sun, Aug 12, 2012 at 9:36 PM, Josh Elser<[EMAIL PROTECTED]>  wrote:
>> Neat idea, Keith.
>> Have you thought about how to support more complex types? Specifically,
>> arrays, hashes and the nesting of those? Any thoughts about indexing for
>> those complex types?
> Yeah I was thinking that would be nice.  I see a lot of users putting
> multiple types into the row and/or columns.  Could have something like
> TupleEncoder<List<A>>.   TupleEncoder would need to encode its elements
> such that it sorts correctly.  However, this may be cumbersome to use
> if you want to use different types.  For example I want a row composed
> of a Long and String.  I was thinking of having the following types to
> handle this case.
> class Pair<A,B>  extends LexEncoder{
>     Pair(LexEncoder<A>  enc1, LexEncoder<B>  enc2);
>     A getFirst(){}
>     B getSecond(){}
> }
> class Triple<A,B,C>{//follows same pattern as Pair}
> class Quadruple<A,B,C,D>{//follows same pattern as Pair}
> This would allow a user to write code like the following that makes it
> easy to work with a row composed of a Long and String.
> Pair<Long, String>  pair;
> long l = pair.getFirst();
> String s = pair.getSecond();
> I am still thinking the tuple concept through.
> I was not considering indexing.  I assume you mean creating an index
> in another table?
>> Initial thoughts are that it would make the most sense to place Typo at the
>> contrib level (or something equivalent). The reason being: Typo doesn't
>> change the underlying functionality of Accumulo; it only provides a layer on
>> top of it that makes life easier for developers.
> I think putting it in contrib makes sense.
>> On 08/10/2012 07:07 PM, Keith Turner wrote:
>>> I put together a simple abstraction layer for Accumulo that makes it
>>> easier to read and write Java objects to Accumulo key and value
>>> fields.  The data written to Accumulo sorts correctly
>>> lexicographically.
>>> I put the code on github and would like some feedback on the design
>>> and whether it should be included with Accumulo.
>>> https://github.com/keith-turner/typo
>>> It's still a little rough and I need to add encoders for all of the
>>> primitive types.
>>> Keith
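For the Long-and-String row discussed in the quoted text, one possible way to get bytes that sort correctly is sketched below. This is not Typo's actual encoding; the class and method names are hypothetical. It assumes keys are compared as unsigned bytes, writes the long big-endian with its sign bit flipped so negative values sort first, and appends the string as UTF-8. Since the long is fixed width, no separator is needed between the two parts.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LongStringPairSketch {
    /** Encodes (first, second) so unsigned byte order matches (long order, then string order). */
    static byte[] encode(long first, String second) {
        byte[] s = second.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(Long.BYTES + s.length)
                .putLong(first ^ Long.MIN_VALUE) // flip sign bit; ByteBuffer is big-endian by default
                .put(s)
                .array();
    }

    /** Recovers the long by reading 8 bytes and flipping the sign bit back. */
    static long decodeFirst(byte[] encoded) {
        return ByteBuffer.wrap(encoded).getLong() ^ Long.MIN_VALUE;
    }

    /** Recovers the string from the bytes after the fixed-width long. */
    static String decodeSecond(byte[] encoded) {
        return new String(encoded, Long.BYTES, encoded.length - Long.BYTES, StandardCharsets.UTF_8);
    }

    /** Unsigned lexicographic comparison, standing in for how sorted keys are ordered. */
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }
}
```

Strings are ordered here by their UTF-8 bytes, which follows code point order and can differ from String.compareTo for characters outside the Basic Multilingual Plane. A pair whose first element is variable width (for example a String) would additionally need a terminator or escaping scheme, which this sketch does not attempt.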
Christopher Tubbs 2012-08-13, 21:12
Ed Kohlwey 2012-08-15, 13:19
Keith Turner 2012-08-15, 13:38
Marc Parisi 2012-08-15, 13:45
Ed Kohlwey 2012-08-15, 14:09
Keith Turner 2012-08-15, 16:50
Ed Kohlwey 2012-08-16, 13:55
Keith Turner 2012-08-14, 17:29
Billie Rinaldi 2012-08-13, 16:34
Keith Turner 2012-08-13, 16:55