Pig >> mail # user >> Override input schema in AvroStorage


Enns, Steven 2013-04-25, 23:01
Re: Override input schema in AvroStorage
Hi Steven,

The new AvroStorage will let you specify the input schema:
https://issues.apache.org/jira/browse/PIG-3015

In fact, somebody made the same request in a comment on that JIRA, which I
am copying and pasting below:

> Furthermore, we occasionally have issues with pig jobs picking the old
> schema when we have a schema update. Manually specifying the schema would
> fix this and give us more flexibility in defining the data we want pig to
> pull from a file.

This JIRA is a work in progress, but hopefully it will be in the next major
release.

Thanks,
Cheolsoo

On Sat, Apr 27, 2013 at 3:24 PM, Enns, Steven <[EMAIL PROTECTED]> wrote:

> Resending now that I am subscribed :)
>
> On 4/25/13 4:01 PM, "Enns, Steven" <[EMAIL PROTECTED]> wrote:
>
> >Hi everyone,
> >
> >I would like to override the input schema in AvroStorage to make a pig
> >script robust to schema evolution.  For example, suppose a new field is
> >added to an avro schema with a default value of null.  If the input to a
> >pig script using this field includes both old and new data, AvroStorage
> >will merge the input schemas from the old and new data.  However, if the
> >input includes only old data, the new schema will not be available to
> >AvroStorage and pig will fail to interpret the script with an error such
> >as "projected field [newField] does not exist in schema".  If AvroStorage
> >accepted an input schema, the script would be valid for both the new and
> >old data.  Is there any plan to implement this?
> >
> >Thanks,
> >Steve
> >
>
>
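The scenario in the quoted message can be sketched in plain Python. This is only an illustration of the Avro schema-resolution rule that a reader schema field with a default is filled in when the written data lacks it. It is not AvroStorage's implementation, and the `Event` schema, the `id` field, and the `resolve` helper are hypothetical names chosen for the example. Only `newField` and its null default come from the message above.

```python
import json

# Hypothetical reader schema: the "new" version, which adds newField
# with a default of null (as described in the quoted message).
READER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "newField", "type": ["null", "string"], "default": null}
  ]
}
""")


def resolve(record, reader_schema):
    """Project an already-decoded record onto the reader schema.

    Fields missing from the record take the reader schema's default;
    a missing field without a default is an error.
    """
    out = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            out[name] = record[name]
        elif "default" in field:
            out[name] = field["default"]
        else:
            raise KeyError(f"field {name!r} is missing and has no default")
    return out


old_record = {"id": 1}                   # written before newField existed
new_record = {"id": 2, "newField": "x"}  # written with the new schema

resolved = [resolve(r, READER_SCHEMA) for r in (old_record, new_record)]
print(resolved)
```

With an explicit reader schema, the old record still exposes `newField` (as null), so a projection on that field stays valid even when the input contains only old data.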
Enns, Steven 2013-05-01, 17:29