Russell Jurney 2012-08-23, 03:37
Re: Pig with Avro and HBase
I am using Pig on Avro data files, and Avro in HBase.

Can you elaborate on what you mean by 'auto-load the schema'?  In the
sense that a Pig LOAD statement doesn't have to declare the schema?  I do
this with Avro data files to some extent (with limitations).

A working implementation of
https://issues.apache.org/jira/browse/AVRO-1124 seems to be the way to go
for tracking a mapping from something like a table or known file type to a
sequence of schemas (and the most recent schema).  Then a Pig loader could
load from HBase using the most recent schema from a named schema group, or
read the same data from files that belong to the same schema group with
an Avro file loader.
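
To make the schema-group idea concrete, here is a minimal sketch of a mapping
from a named group to an ordered sequence of schemas, with a lookup for the
most recent one.  It is an assumption-laden illustration only: the class
SchemaGroupRegistry, its method names, and the "users" group are hypothetical,
it keeps everything in memory, and it is not the API proposed in AVRO-1124.

```python
import json


class SchemaGroupRegistry:
    """Hypothetical in-memory registry: group name -> ordered list of schemas."""

    def __init__(self):
        self._groups = {}

    def register(self, group, schema_json):
        """Add a schema to a group unless it is already there; return its 0-based id."""
        # Normalize JSON object key order so textual differences don't create duplicates.
        canonical = json.dumps(json.loads(schema_json), sort_keys=True)
        schemas = self._groups.setdefault(group, [])
        if canonical in schemas:
            return schemas.index(canonical)
        schemas.append(canonical)
        return len(schemas) - 1

    def get(self, group, schema_id):
        """Return one specific schema (as a dict) from a group's history."""
        return json.loads(self._groups[group][schema_id])

    def latest(self, group):
        """Return the most recently registered schema (as a dict) for a group."""
        schemas = self._groups.get(group)
        if not schemas:
            raise KeyError("no schemas registered for group: " + group)
        return json.loads(schemas[-1])


# Two versions of an illustrative record schema; v2 adds a field with a default.
v1 = json.dumps({
    "type": "record", "name": "User",
    "fields": [{"name": "id", "type": "long"}],
})
v2 = json.dumps({
    "type": "record", "name": "User",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "email", "type": ["null", "string"], "default": None},
    ],
})

registry = SchemaGroupRegistry()
first_id = registry.register("users", v1)
second_id = registry.register("users", v2)
reader_schema = registry.latest("users")
```

A loader for either HBase cells or data files could then ask for
registry.latest("users") as its reader schema, while older records remain
decodable through registry.get("users", schema_id) as the writer schema.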
On 8/22/12 8:37 PM, "Russell Jurney" <[EMAIL PROTECTED]> wrote:
>
>Is anyone using Pig with Avro as the datatype in HBase? I want to
>auto-load the schema, and this seems the most direct way to do it.
>
>--
>Russell Jurney twitter.com/rjurney <http://twitter.com/rjurney>
>[EMAIL PROTECTED] datasyndrome.com <http://datasyndrome.com/>