I am using Pig on Avro data files, and Avro in HBase.
Can you elaborate on what you mean by 'auto-load the schema'? Do you mean that a big LOAD statement doesn't have to declare the schema? I do this with Avro data files to some extent (with limitations).
A working implementation of https://issues.apache.org/jira/browse/AVRO-1124 seems to be the way to go for tracking a mapping from something like a table or known file type to a sequence of schemas (and the most recent schema). A Pig loader could then load from HBase using the most recent schema from a named schema group, or read the same data from files that belong to the same schema group with an Avro file loader.

On 8/22/12 8:37 PM, "Russell Jurney" <[EMAIL PROTECTED]> wrote:

>Is anyone using Pig with Avro as the datatype in HBase? I want to
>auto-load the schema, and this seems the most direct way to do it.
>
>--
>Russell Jurney twitter.com/rjurney <http://twitter.com/rjurney>
>[EMAIL PROTECTED] datasyndrome.com <http://datasyndrome.com/>
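
P.S. To make the "named schema group" idea in my reply a bit more concrete, here is a rough sketch in plain Python. It is not the AVRO-1124 API and not an existing Pig or HBase interface: the class name SchemaGroupRegistry, its methods, and the "events" group are all made up for illustration, and it only keeps things in memory. It just shows the mapping I have in mind: a group name (a table or a known file type) points to an ordered list of schemas, and a loader asks for the most recent one.

```python
import json


class SchemaGroupRegistry:
    """Hypothetical in-memory registry: group name -> ordered list of Avro schemas."""

    def __init__(self):
        self._groups = {}

    def register(self, group, schema_json):
        """Add a schema to a group and return its 1-based version number."""
        schema = json.loads(schema_json)  # fails early on malformed JSON
        versions = self._groups.setdefault(group, [])
        # Re-registering the current latest schema does not create a new version.
        if versions and versions[-1] == schema:
            return len(versions)
        versions.append(schema)
        return len(versions)

    def latest(self, group):
        """Return the most recent schema, the one a loader would read with."""
        versions = self._groups.get(group)
        if not versions:
            raise KeyError("no schemas registered for group %r" % group)
        return versions[-1]

    def history(self, group):
        """Return every schema registered for the group, oldest first."""
        return list(self._groups.get(group, []))


# Two versions of an illustrative record; the second adds an optional field.
v1 = '{"type": "record", "name": "Event", "fields": [{"name": "id", "type": "long"}]}'
v2 = ('{"type": "record", "name": "Event", "fields": ['
      '{"name": "id", "type": "long"}, '
      '{"name": "tag", "type": ["null", "string"], "default": null}]}')

registry = SchemaGroupRegistry()
registry.register("events", v1)
registry.register("events", v2)

# An HBase loader and a file loader would both ask for the same thing.
reader_schema = registry.latest("events")
```

In a real setup the registry would need to be shared and durable, and the loader would still need the writer's schema for each stored value to resolve it against the latest one. The sketch leaves both of those out.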