I am using Pig on Avro data files, and Avro in HBase.
Can you elaborate on what you mean by 'auto-load the schema'? Do you mean
that a big LOAD statement doesn't have to declare the schema? I do this
with Avro data files to some extent (with limitations).
A working implementation of
https://issues.apache.org/jira/browse/AVRO-1124 seems to be the way to go
for tracking a mapping from something like a table or known file type to a
sequence of schemas (and the most recent schema). A Pig loader could then
load from HBase using the most recent schema from a named schema group, or
an Avro file loader could read the same data from files that belong to
that schema group.
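
To make the schema-group idea concrete, here is a rough Python sketch of
the mapping I have in mind: a named group holds an ordered sequence of
schemas, and a loader asks for the most recent one. This is not the
AVRO-1124 API; the class SchemaGroups, its methods, and the "users" group
are hypothetical names made up for illustration, and a real repository
would be a shared service rather than an in-memory dict.

```python
import json


class SchemaGroups:
    """Hypothetical in-memory stand-in for a schema repository."""

    def __init__(self):
        # group name -> list of schemas (as canonical JSON), oldest first
        self._groups = {}

    def register(self, group, schema):
        """Add a schema to a group if it is new; return its version index."""
        canonical = json.dumps(schema, sort_keys=True)
        versions = self._groups.setdefault(group, [])
        if canonical not in versions:
            versions.append(canonical)
        return versions.index(canonical)

    def latest(self, group):
        """Return the most recently registered schema for a group."""
        return json.loads(self._groups[group][-1])

    def history(self, group):
        """Return every schema registered for a group, oldest first."""
        return [json.loads(s) for s in self._groups[group]]


# Two versions of an illustrative Avro record schema.
v1 = {
    "type": "record",
    "name": "User",
    "fields": [{"name": "id", "type": "long"}],
}
v2 = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
}

repo = SchemaGroups()
repo.register("users", v1)
repo.register("users", v2)

# A loader (HBase or file based) would use this as its reader schema.
reader_schema = repo.latest("users")
```

Both the HBase loader and the file loader would call something like
latest("users") for the reader schema, and use the group's history to
decode records written with older versions.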
On 8/22/12 8:37 PM, "Russell Jurney" <[EMAIL PROTECTED]> wrote:
>Is anyone using Pig with Avro as the datatype in HBase? I want to
>auto-load the schema, and this seems the most direct way to do it.
>Russell Jurney twitter.com/rjurney <http://twitter.com/rjurney>
>[EMAIL PROTECTED] datasyndrome.com <http://datasyndrome.com/>