Avro >> mail # user >> Is Avro/Trevni strictly read-only?


Re: Is Avro/Trevni strictly read-only?
interesting -- thanks for the link! Let me know if you have any more Kiji
questions.
Cheers
- Aaron
On Wed, Jan 30, 2013 at 6:49 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:

> I'm looking at Panthera, I'll check out Kiji too. Inferring the schema
> from the first record and creating a table is what is done in Voldemort's
> build/push job, so I'll look into that.
>
>
> https://github.com/voldemort/voldemort/wiki/Build-and-Push-Jobs-for-Voldemort-Read-Only-Stores
>
> Russell Jurney http://datasyndrome.com
>
> On Jan 30, 2013, at 6:33 PM, Aaron Kimball <[EMAIL PROTECTED]> wrote:
>
> Hi Russell,
>
> Great question.  Kiji is more strongly typed than systems like MongoDB.
> While your schema can evolve (using Avro evolution) without structurally
> updating existing data, you still need to specify your Avro schemas in a
> data dictionary. It's challenging to author systems in Java (as is typical
> of HBase/HDFS/MapReduce-facing applications) without some strong typing in
> the persistence layer. You wind up reading a lot of other people's code to
> figure out what types were written -- assuming you can find the code (or
> the hbase columns) in the first place.
>
> You can create table schemas either "manually" by filling out a JSON /
> Avro-based table layout specification, or you can use the DDL shell which
> lets you CREATE TABLE, ALTER TABLE, etc. in a pretty quick way. Once the
> table's set up, then you can write to it.  I think the DDL shell included
> with the bento box makes this a reasonably low-overhead process.
>
> We don't currently have any Pig integration. We've made some initial
> proof-of-concept progress on a StorageHandler that lets Hive query Kiji,
> but it's not in a ready state yet. Someone (you? :) could write a Pig
> integration; Pig already supports Avro I think. And you could even make it
> analyze the first output tuple and use that to infer types/column names to
> set up a result table with the appropriate table schema by invoking the DDL
> procedurally.
>
> Sorry I don't have a "magic wand" answer for you -- for the use cases we
> target, these sorts of setup costs often pay off in the long run, so that's
> the case we've optimized the design around. Let me know if there's anything
> else I can help with.
> Thanks,
> - Aaron
>
>
> On Wed, Jan 30, 2013 at 5:48 PM, Russell Jurney <[EMAIL PROTECTED]> wrote:
>
>> Aaron - is there a way to create a Kiji table from Pig? I'm in the habit
>> of not specifying schemas with Voldemort and MongoDB, just storing a Pig
>> relation and the schema is set in the store. If I can arrange that somehow,
>> I'm all over Kiji. Panthera is a fork :/
>>
>>
>> On Wed, Jan 30, 2013 at 3:20 PM, Aaron Kimball <[EMAIL PROTECTED]> wrote:
>>
>>> Hi ccleve,
>>>
>>> I'd definitely urge you to try out Kiji -- we who work on it think it's
>>> a pretty good fit for this specific use case. If you've got further
>>> questions about Kiji and how to use it, please send them to me, or ask the
>>> kiji user mailing list: http://www.kiji.org/getinvolved#Mailing_Lists
>>>
>>> cheers,
>>> - Aaron
>>>
>>>
>>> On Tue, Jan 29, 2013 at 3:24 PM, Doug Cutting <[EMAIL PROTECTED]> wrote:
>>>
>>>> Avro and Trevni files do not support record update or delete.
>>>>
>>>> For large changing datasets you might use Kiji (http://www.kiji.org/)
>>>> to store Avro data in HBase.
>>>>
>>>> Doug
>>>>
>>>> On Mon, Jan 28, 2013 at 12:00 PM, ccleve <[EMAIL PROTECTED]> wrote:
>>>> > I've gone through the documentation, but haven't been able to get a
>>>> > definite answer: is Avro, or specifically Trevni, only for read-only data?
>>>> >
>>>> > Is it possible to update or delete records?
>>>> >
>>>> > If records can be deleted, is there any code that will merge row sets
>>>> > to get rid of the unused space?
>>>> >
>>>> >
>>>> >
>>>>
>>>
>>>
>>
>>
>> --
>> Russell Jurney twitter.com/rjurney [EMAIL PROTECTED] datasyndrome.com
>>
>
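Aaron's suggestion in the thread, analyzing the first output tuple to infer types and column names, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (avro_type_for, infer_record_schema) and the Python-to-Avro type mapping are hypothetical, the record is a plain dict standing in for a Pig tuple, and turning the resulting Avro record schema into a Kiji table layout or DDL statement is not shown because the thread does not describe that step.

```python
import json

# Assumed mapping from Python value types to Avro primitive type names.
# bool must come before int because bool is a subclass of int in Python.
_AVRO_PRIMITIVES = [
    (bool, "boolean"),
    (int, "long"),
    (float, "double"),
    (str, "string"),
    (bytes, "bytes"),
]


def avro_type_for(value):
    """Return the Avro primitive type name for a single sample value."""
    if value is None:
        return "null"
    for py_type, avro_name in _AVRO_PRIMITIVES:
        if isinstance(value, py_type):
            return avro_name
    raise TypeError("no Avro primitive for " + type(value).__name__)


def infer_record_schema(name, first_record):
    """Build an Avro record schema (as a dict) from one sample record."""
    fields = [
        {"name": key, "type": avro_type_for(value)}
        for key, value in first_record.items()
    ]
    return {"type": "record", "name": name, "fields": fields}


if __name__ == "__main__":
    sample = {"id": 42, "email": "a@example.com", "active": True}
    print(json.dumps(infer_record_schema("User", sample), indent=2))
```

A single sample cannot tell whether a field is nullable or whether later records carry extra fields, so a schema inferred this way is probably best treated as a starting point to review rather than a final table definition.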
>
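Since Doug's answer is that Avro and Trevni files do not support record update or delete, the usual workaround for write-once files is to copy the surviving records into a new file and swap it in. The sketch below shows that pattern with newline-delimited JSON as a stand-in for an Avro data file, so it runs with the standard library alone; rewrite_without and its should_delete predicate are hypothetical names, not part of Avro or Trevni.

```python
import json
import os
import tempfile


def rewrite_without(path, should_delete):
    """Rewrite a write-once record file, dropping records that match.

    The file at `path` holds one JSON record per line (a stand-in for an
    Avro data file). Surviving records are copied to a temporary file in
    the same directory, which then replaces the original.
    Returns the number of records kept.
    """
    dir_name = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix=".tmp")
    kept = 0
    with os.fdopen(fd, "w") as out, open(path) as src:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            if should_delete(record):
                continue  # "delete" by not copying the record
            out.write(json.dumps(record) + "\n")
            kept += 1
    os.replace(tmp_path, path)  # swap the compacted file into place
    return kept
```

An update works the same way, with the record modified before it is written out. Every change costs a full pass over the file, which is why a store such as HBase (via Kiji, as suggested in the thread) tends to be a better fit for large, frequently changing datasets.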