Pig, mail # user - using multiple avro schemas in globbed files (schema merging)

Re: using multiple avro schemas in globbed files (schema merging)
Cheolsoo Park 2012-07-26, 16:51
Hi Nebo,

Thank you for your suggestion. I will keep it in mind.

Cheolsoo

On Wed, Jul 25, 2012 at 11:40 PM, Zebeljan, Nebojsa <[EMAIL PROTECTED]> wrote:

> Hi Cheolsoo, hi Philipp,
> we've already applied Stan's original patch to pig-0.9.2-cdh4.0.1 and
> adjusted it to our needs.
>
> Specifically, we removed the schema name validation from the method
> "org.apache.pig.piggybank.storage.avro.AvroStorageUtils.union(Schema,
> Schema)", since in our case the schemas always have the same name.
>
> Code:
> //      if (x.getName().equals(y.getName())) {
> //              throw new RuntimeException("Union of two schemas of the same name is not supported");
> //      }
>
> @Cheolsoo: When applying the "merge code" to the piggybank codebase, please
> consider whether this check makes sense in general.
>
> By the way, the patch works pretty well for us. Thanks to Stan!
>
> Regards,
> Nebo
>
> -----Original Message-----
> From: Cheolsoo Park [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, July 25, 2012 23:18
> To: [EMAIL PROTECTED]
> Subject: Re: using multiple avro schemas in globbed files (schema merging)
>
> Hi Philipp,
>
> Sure, I put PIG-2579 into my queue. I will start working on it shortly.
>
> Thanks,
> Cheolsoo
>
> On Wed, Jul 25, 2012 at 7:35 AM, Philipp Pahl <[EMAIL PROTECTED]> wrote:
>
> > Hi Cheolsoo,
> >
> > I saw that you integrated the "globs and commas" support into the pig
> > code. I was wondering if you are also planning to integrate the
> > multiple Avro schema support, which I would greatly appreciate.
> >
> > Thanks and regards
> > Philipp
> >
> >
> > On 07/17/2012 07:03 PM, Cheolsoo Park wrote:
> >
> >> Hi Markus,
> >>
> >> Thank you for sharing your problem.
> >>
> >> Looking at the PIG-2579 patch
> >> <https://issues.apache.org/jira/browse/PIG-2579>, it seems to try to
> >> address two issues at the same time:
> >> 1) Globs support
> >> 2) Multiple Avro schemas support
> >>
> >> I think that it's better to solve one issue at a time. In fact, there
> >> is another jira, PIG-2492
> >> <https://issues.apache.org/jira/browse/PIG-2492>, that tries to address
> >> #1 particularly. Once PIG-2492 is resolved, I think we can rebase/fix
> >> the PIG-2579 patch on top of that.
> >>
> >> I am happy to work on both jiras. Please let me know what you think.
> >>
> >> Thanks,
> >> Cheolsoo
> >>
> >> On Tue, Jul 17, 2012 at 4:26 AM, Markus Resch <[EMAIL PROTECTED]> wrote:
> >>
> >>> Hey everyone,
> >>>
> >>> in the thread "Downgrade CDH4 to CDH3" on the Cloudera mailing list
> >>> I talked about issues we had with Pig while testing CDH4 and that we
> >>> had trouble switching back to CDH3. After I figured out the
> >>> reason for our Pig issue, I tried to apply the patch
> >>> (https://issues.apache.org/jira/browse/PIG-2579) to the CDH4 version
> >>> of Pig. Sadly, this was much harder than applying this particular
> >>> patch to the CDH3 version of Pig before. Does anyone have this or a
> >>> similar patch in a form that is suitable for the CDH4 version of Pig?
> >>> I'm just asking because doing the work twice doesn't help anyone. If
> >>> this work is already done, could this patch be attached to the
> >>> PIG-2579 ticket as well?
> >>>
> >>> Thanks
> >>>
> >>> Markus
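
For readers weighing whether the name check in AvroStorageUtils.union(Schema, Schema) makes sense in general: the Avro specification does not allow a union to contain two named types with the same name, so once the check is removed, two same-named record schemas presumably have to be combined into a single record instead. The following is a minimal sketch of that idea using only the Java standard library. The Record class and the mergeSameName method are hypothetical stand-ins invented for illustration; they are not the piggybank code or the PIG-2579 patch, whose actual merge rules are not shown in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SchemaMergeSketch {

    /** Hypothetical stand-in for a record schema: a name plus ordered fields (field name -> type name). */
    static final class Record {
        final String name;
        final Map<String, String> fields;

        Record(String name, Map<String, String> fields) {
            this.name = name;
            this.fields = new LinkedHashMap<>(fields);
        }
    }

    /**
     * Merges two same-named records field by field (an assumed merge rule).
     * Differing record names and conflicting field types are rejected.
     */
    static Record mergeSameName(Record x, Record y) {
        if (!x.name.equals(y.name)) {
            throw new IllegalArgumentException(
                "Record names differ: " + x.name + " vs " + y.name);
        }
        Map<String, String> merged = new LinkedHashMap<>(x.fields);
        for (Map.Entry<String, String> e : y.fields.entrySet()) {
            // putIfAbsent returns the existing type if the field is already present
            String existing = merged.putIfAbsent(e.getKey(), e.getValue());
            if (existing != null && !existing.equals(e.getValue())) {
                throw new IllegalArgumentException(
                    "Field '" + e.getKey() + "' has conflicting types: "
                        + existing + " vs " + e.getValue());
            }
        }
        return new Record(x.name, merged);
    }

    public static void main(String[] args) {
        // Two versions of the same record, as might appear in globbed input files
        Map<String, String> v1 = new LinkedHashMap<>();
        v1.put("id", "long");
        v1.put("url", "string");

        Map<String, String> v2 = new LinkedHashMap<>();
        v2.put("id", "long");
        v2.put("referrer", "string");

        Record merged = mergeSameName(new Record("Event", v1), new Record("Event", v2));
        System.out.println(merged.name + " " + merged.fields);
        // prints: Event {id=long, url=string, referrer=string}
    }
}
```

In this sketch a field present in only one version is simply carried over, and a field whose type differs between versions is treated as an error. A real merge would also need to decide on defaults for missing fields and on type promotion, which this example deliberately leaves out.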