Accumulo, mail # dev - feedback on Typo


Re: feedback on Typo
Christopher Tubbs 2012-08-13, 21:12
Am I right in assuming that this is about simplifying the API for
storing typed data in the key, and not about providing a mechanism for
query? Isn't this really just about storing stuff you've already
decided was a good structure for whatever your query mechanism is?

On Mon, Aug 13, 2012 at 6:03 PM, Josh Elser <[EMAIL PROTECTED]> wrote:
> Even with something as simple as a pair, things can start getting difficult.
> I suppose it really revolves around the level of support you want to provide
> at scan time, e.g. "find all pairs where the second is 'x'?".
>
> Spending a few minutes thinking about it, an index could be a separate table
> but wouldn't necessarily have to be. It depends on the complexity of the
> structure you're trying to index. Using the Pair example again, you could
> reserve a column (family) to place index records in which simply inverts the
> Pair in the colqual.
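The in-table index described above can be sketched roughly as follows. This is only an illustration, not Accumulo or Typo code: a `TreeMap` stands in for a sorted table, and the class name `InvertedPairIndex`, the family names `data` and `idx`, and the `\u0000` separator are all assumptions made for the example. It also assumes the pair elements are strings that never contain the separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch: a sorted map stands in for one table.
final class InvertedPairIndex {
    private static final char SEP = '\u0000';   // assumed never to occur in the data
    static final String DATA = "data";          // assumed family holding the pairs
    static final String INDEX = "idx";          // assumed reserved family for index records

    // Key layout: row SEP family SEP qualifier, kept in sorted order.
    private final TreeMap<String, String> table = new TreeMap<>();

    void put(String row, String first, String second) {
        // Data record: the pair as written, (first, second), in the qualifier.
        table.put(row + SEP + DATA + SEP + first + SEP + second, "");
        // Index record: same row, reserved family, pair inverted in the qualifier.
        table.put(row + SEP + INDEX + SEP + second + SEP + first, "");
    }

    // Returns the first elements of all pairs in 'row' whose second element equals 'second'.
    List<String> firstsWhereSecondIs(String row, String second) {
        String prefix = row + SEP + INDEX + SEP + second + SEP;
        List<String> firsts = new ArrayList<>();
        // Prefix scan: start at the prefix and stop at the first key that no longer matches.
        for (String key : table.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break;
            }
            firsts.add(key.substring(prefix.length()));
        }
        return firsts;
    }
}
```

Because the inverted records sort by the second element, the question "find all pairs where the second is 'x'" becomes a short range scan instead of a full pass over the row.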
>
>
> On 08/13/2012 11:06 AM, Keith Turner wrote:
>>
>> On Sun, Aug 12, 2012 at 9:36 PM, Josh Elser<[EMAIL PROTECTED]>  wrote:
>>>
>>> Neat idea, Keith.
>>>
>>> Have you thought about how to support more complex types? Specifically,
>>> arrays, hashes and the nesting of those? Any thoughts about indexing for
>>> those complex types?
>>
>> Yeah I was thinking that would be nice.  I see a lot of users putting
>> multiple types into the row and/or columns.  Could have something like
>> TupleEncoder<List<A>>.   TupleEncoder would need to encode its elements
>> such that it sorts correctly.  However, this may be cumbersome to use
>> if you want to use different types.  For example I want a row composed
>> of a Long and String.  I was thinking of having the following types to
>> handle this case.
>>
>> class Pair<A,B>  extends LexEncoder{
>>     Pair(LexEncoder<A>  enc1, LexEncoder<B>  enc2);
>>     A getFirst(){}
>>     B getSecond(){}
>> }
>>
>> class Triple<A,B,C>{ /* follows same pattern as Pair */ }
>> class Quadruple<A,B,C,D>{ /* follows same pattern as Pair */ }
>>
>> This would allow a user to write code like the following that makes it
>> easy to work with a row composed of a Long and String.
>>
>> Pair<Long, String>  pair;
>> long l = pair.getFirst();
>> String s = pair.getSecond();
>>
>> I am still thinking the tuple concept through.
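One possible way to make such a pair sort correctly is sketched below. It is not Typo's implementation: the `Encoder` interface is a hypothetical stand-in for the `LexEncoder` mentioned above, and `LongEncoder`, `StringEncoder`, `PairEncoder` and `Bytes` are names invented for the example. The sketch assumes the first field is escaped (0x00 becomes 0x01 0x01, 0x01 becomes 0x01 0x02) and terminated with 0x00, so that the concatenated bytes compare in the same order as the pair itself.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for the LexEncoder type discussed in the thread.
interface Encoder<T> {
    byte[] encode(T value);
    T decode(byte[] bytes);
}

final class LongEncoder implements Encoder<Long> {
    // Flipping the sign bit makes the big-endian bytes sort in numeric order.
    public byte[] encode(Long v) {
        return ByteBuffer.allocate(8).putLong(v ^ Long.MIN_VALUE).array();
    }
    public Long decode(byte[] b) {
        return ByteBuffer.wrap(b).getLong() ^ Long.MIN_VALUE;
    }
}

final class StringEncoder implements Encoder<String> {
    public byte[] encode(String s) { return s.getBytes(StandardCharsets.UTF_8); }
    public String decode(byte[] b) { return new String(b, StandardCharsets.UTF_8); }
}

final class Pair<A, B> {
    private final A first;
    private final B second;
    Pair(A first, B second) { this.first = first; this.second = second; }
    A getFirst() { return first; }
    B getSecond() { return second; }
}

final class PairEncoder<A, B> implements Encoder<Pair<A, B>> {
    private final Encoder<A> enc1;
    private final Encoder<B> enc2;
    PairEncoder(Encoder<A> enc1, Encoder<B> enc2) { this.enc1 = enc1; this.enc2 = enc2; }

    public byte[] encode(Pair<A, B> p) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Escape 0x00 and 0x01 in the first field so a bare 0x00 can terminate it.
        for (byte b : enc1.encode(p.getFirst())) {
            if (b == 0x00) { out.write(0x01); out.write(0x01); }
            else if (b == 0x01) { out.write(0x01); out.write(0x02); }
            else { out.write(b); }
        }
        out.write(0x00);
        byte[] second = enc2.encode(p.getSecond());
        out.write(second, 0, second.length);
        return out.toByteArray();
    }

    public Pair<A, B> decode(byte[] bytes) {
        ByteArrayOutputStream first = new ByteArrayOutputStream();
        int i = 0;
        while (bytes[i] != 0x00) {
            // Undo the escaping: 0x01 0x01 -> 0x00 and 0x01 0x02 -> 0x01.
            if (bytes[i] == 0x01) { i++; first.write(bytes[i] - 1); }
            else { first.write(bytes[i]); }
            i++;
        }
        byte[] second = Arrays.copyOfRange(bytes, i + 1, bytes.length);
        return new Pair<>(enc1.decode(first.toByteArray()), enc2.decode(second));
    }
}

final class Bytes {
    // Unsigned lexicographic comparison, the order a sorted store applies to raw keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = (a[i] & 0xff) - (b[i] & 0xff);
            if (c != 0) return c;
        }
        return a.length - b.length;
    }
}
```

With this layout, `new PairEncoder<>(new LongEncoder(), new StringEncoder())` would encode a row made of a Long and a String, and comparing the encoded bytes orders rows by the Long first and the String second. Only the first field needs escaping here because the second field runs to the end of the array.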
>>
>> I was not considering indexing.  I assume you mean creating an index
>> in another table?
>>
>>> Initial thoughts are that it would make the most sense to place Typo at
>>> the contrib level (or something equivalent). The reason being: Typo
>>> doesn't change the underlying functionality of Accumulo; it only provides
>>> a layer on top of it that makes life easier for developers.
>>
>> I think putting it in contrib makes sense.
>>
>>>
>>> On 08/10/2012 07:07 PM, Keith Turner wrote:
>>>>
>>>> I put together a simple abstraction layer for Accumulo that makes it
>>>> easier to read and write Java objects to Accumulo key and value
>>>> fields.  The data written to Accumulo sorts correctly
>>>> lexicographically.
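To illustrate what "sorts correctly lexicographically" asks of an encoder, here is a minimal sketch for `double` values. It is not taken from Typo; the class name `SortableDouble` and its methods are assumptions for the example. The idea is to transform the IEEE 754 bits so that comparing the big-endian bytes as unsigned values gives the same order as comparing the numbers.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of an order-preserving encoding for double values.
final class SortableDouble {
    static byte[] encode(double d) {
        long bits = Double.doubleToLongBits(d);
        if (bits < 0) {
            bits = ~bits;               // negative numbers: invert all bits to reverse their order
        } else {
            bits ^= Long.MIN_VALUE;     // non-negative numbers: set the top bit so they sort last
        }
        return ByteBuffer.allocate(8).putLong(bits).array();
    }

    static double decode(byte[] b) {
        long bits = ByteBuffer.wrap(b).getLong();
        if (bits < 0) {
            bits ^= Long.MIN_VALUE;     // top bit set: the value was non-negative
        } else {
            bits = ~bits;               // top bit clear: the value was negative
        }
        return Double.longBitsToDouble(bits);
    }

    // Unsigned byte-by-byte comparison, as a sorted key-value store would apply to keys.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = (a[i] & 0xff) - (b[i] & 0xff);
            if (c != 0) return c;
        }
        return a.length - b.length;
    }
}
```

Writing the raw bits without this transformation would put every negative number after the positive ones, since the sign bit is the most significant bit of the first byte.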
>>>>
>>>> I put the code on github and would like some feedback on the design
>>>> and whether it should be included with Accumulo.
>>>>
>>>> https://github.com/keith-turner/typo
>>>>
>>>> It's still a little rough and I need to add encoders for all of the
>>>> primitive types.
>>>>
>>>> Keith