Avro >> mail # user >> Pig with Avro and HBase


Russell Jurney 2012-08-23, 03:37
Re: Pig with Avro and HBase
I am using Pig on Avro data files, and Avro in HBase.

Can you elaborate on what you mean by 'auto-load the schema'?  Do you mean
that a Pig LOAD statement doesn't have to declare the schema?  I do this
with Avro data files to some extent (with limitations).

A working implementation of
https://issues.apache.org/jira/browse/AVRO-1124 seems to be the way to go
for tracking a mapping from something like a table or known file type to a
sequence of schemas (and the most recent schema).  A Pig loader could then
load from HBase using the most recent schema from a named schema group, or
read the same data from files that belong to the same schema group using
an Avro file loader.
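
To make the "schema group" idea concrete, here is a minimal in-memory
sketch of the mapping described above: a group name (say, an HBase table
or a known file type) maps to an ordered sequence of schemas, and a loader
asks for the most recent one.  This is not the AVRO-1124 API; the class
SchemaGroupRegistry and its methods register, latest and history are
hypothetical names made up for illustration, and schemas are handled as
plain JSON-style dicts rather than parsed Avro schemas.

```python
import json
from collections import defaultdict


class SchemaGroupRegistry:
    """Hypothetical in-memory stand-in for a schema repository.

    Not the AVRO-1124 interface; it only illustrates the mapping
    group name -> ordered sequence of schemas.
    """

    def __init__(self):
        # group name -> list of schemas as canonical JSON text, oldest first
        self._groups = defaultdict(list)

    def register(self, group, schema):
        """Add a schema to a group and return its 0-based position.

        Registering a schema the group already holds returns the
        existing position instead of appending a duplicate.
        """
        canonical = json.dumps(schema, sort_keys=True)
        schemas = self._groups[group]
        if canonical in schemas:
            return schemas.index(canonical)
        schemas.append(canonical)
        return len(schemas) - 1

    def latest(self, group):
        """Return the most recently registered schema for a group."""
        schemas = self._groups.get(group)
        if not schemas:
            raise KeyError(group)
        return json.loads(schemas[-1])

    def history(self, group):
        """Return every schema registered for a group, oldest first."""
        return [json.loads(s) for s in self._groups.get(group, [])]


if __name__ == "__main__":
    registry = SchemaGroupRegistry()
    registry.register("users", {
        "type": "record", "name": "User",
        "fields": [{"name": "id", "type": "long"}],
    })
    registry.register("users", {
        "type": "record", "name": "User",
        "fields": [
            {"name": "id", "type": "long"},
            {"name": "email", "type": ["null", "string"], "default": None},
        ],
    })
    # A loader for HBase or for Avro files would start from this schema.
    print(json.dumps(registry.latest("users"), indent=2))
```

Under this sketch, an HBase loader and an Avro file loader would both call
latest() with the same group name, so neither LOAD statement would need to
spell out the schema itself.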
On 8/22/12 8:37 PM, "Russell Jurney" <[EMAIL PROTECTED]> wrote:
>
>Is anyone using Pig with Avro as the datatype in HBase? I want to
>auto-load the schema, and this seems the most direct way to do it.
>
>--
>Russell Jurney twitter.com/rjurney <http://twitter.com/rjurney>
>[EMAIL PROTECTED] datasyndrome.com <http://datasyndrome.com/>