Re: Schema mismatch for files with changing avro schemas
No, AvroStorage doesn't currently support projection push-down. Looking at
the Avro integration code though, this seems feasible.

On Thu, Apr 5, 2012 at 11:27 AM, Jonathan Coveney <[EMAIL PROTECTED]> wrote:

> For the Avro people, does AvroStorage support column pruning?
>
> 2012/4/5 Stan Rosenberg <[EMAIL PROTECTED]>
>
> > AFAIK, by default AvroStorage enforces that all input files have
> > exactly the same schema.  I've submitted a patch to improve
> > this somewhat by allowing different input schemas so long as a union
> > schema can be derived; e.g., say schema 1 contains field 'foo' which
> > is not in schema 2, and schema 2 contains 'bar' which is not in
> > schema 1, then the resulting schema will have both fields, etc.
> > (The patch is here: https://issues.apache.org/jira/browse/PIG-2579.)
> >
> > In your case, you seem to have different schemas where the difference
> > is actually in fields which are never used inside Pig.
> > That's an entirely new use case, AFAIK.  The union schema is one
> > workaround.  However, it might be better to specify these unused
> > fields and exclude them from validation; perhaps running validation
> > only against those fields which are specified in the Pig script.
> >
> > Best,
> >
> > stan
> >
> > On Thu, Apr 5, 2012 at 8:58 AM, Philipp <[EMAIL PROTECTED]> wrote:
> > > Hi list,
> > >
> > > if I run Pig over several Avro files with different schemas I get a
> > > schema mismatch message, even if the schema has only changed
> > > marginally in a field that I'm not even using in that particular
> > > Pig job.
> > > I'm wondering if it would be possible to resolve the mismatch, e.g. as
> > > suggested in:
> > > https://avro.apache.org/docs/current/spec.html#Schema+Resolution
> > >
> > > Regards, Philipp
> > >
> > >
> >
>

--
*Note that I'm no longer using my Yahoo! email address. Please email me at
[EMAIL PROTECTED] going forward.*
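
For illustration, the union-schema idea described in the thread (schema 1
has 'foo', schema 2 has 'bar', the result has both) can be sketched roughly
as below. This is only a minimal sketch and not the PIG-2579 patch: the
helper name union_record_schemas, the record name "Event" and the field
'id' are hypothetical, schemas are treated as plain JSON-style dicts, and
it assumes a field that appears in both schemas must have an identical
type. A real implementation would presumably also have to decide how to
represent fields missing from some files (e.g. as nullable), which is
skipped here.

```python
def union_record_schemas(a, b):
    """Merge two Avro-style record schemas (plain dicts) field by field.

    Hypothetical helper: fields present in only one schema are carried
    over unchanged; a field present in both must have the same type.
    """
    merged = {}
    for schema in (a, b):
        for field in schema["fields"]:
            name = field["name"]
            if name in merged and merged[name]["type"] != field["type"]:
                raise ValueError(f"conflicting types for field {name!r}")
            # Keep the first definition seen for each field name.
            merged.setdefault(name, field)
    return {"type": "record", "name": a["name"], "fields": list(merged.values())}


# Two example schemas that differ only in one field each.
schema1 = {"type": "record", "name": "Event", "fields": [
    {"name": "id", "type": "long"},
    {"name": "foo", "type": "string"},
]}
schema2 = {"type": "record", "name": "Event", "fields": [
    {"name": "id", "type": "long"},
    {"name": "bar", "type": "int"},
]}

union = union_record_schemas(schema1, schema2)
field_names = [f["name"] for f in union["fields"]]  # ['id', 'foo', 'bar']
```

Raising on a type conflict keeps the sketch conservative; the schema
resolution rules in the Avro spec linked above allow some type promotions
(such as int to long) that this sketch does not attempt.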