Re: using multiple avro schemas in globbed files (schema merging)
Zebeljan, Nebojsa 2012-07-26, 06:40
Hi Cheolsoo, hi Philipp,
we've already applied Stan's original patch to pig-0.9.2-cdh4.0.1 and adjusted it to our needs. Specifically, in the method "org.apache.pig.piggybank.storage.avro.AvroStorageUtils.union(Schema, Schema)" we removed the schema name validation, since in our case the schemas always have the same name.

Code:

    // if (x.getName().equals(y.getName())) {
    //     throw new RuntimeException("Union of two schemas of the same name is not supported");
    // }

@Cheolsoo: When applying the "merge code" to the piggybank codebase, please consider whether this check makes sense in general.

By the way, the patch works pretty well for us. Thanks to Stan!

Regards,
Nebo

-----Original Message-----
From: Cheolsoo Park [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, 25 July 2012 23:18
To: [EMAIL PROTECTED]
Subject: Re: using multiple avro schemas in globbed files (schema merging)

Hi Philipp,

Sure, I put PIG-2579 into my queue. I will start working on it shortly.

Thanks,
Cheolsoo

On Wed, Jul 25, 2012 at 7:35 AM, Philipp Pahl <[EMAIL PROTECTED]> wrote:

> Hi Cheolsoo,
>
> I saw that you integrated the "globs and commas" support into the pig
> code. I was wondering if you are also planning to integrate the
> multiple Avro schema support, which I would greatly appreciate.
>
> Thanks and regards
> Philipp
>
> On 07/17/2012 07:03 PM, Cheolsoo Park wrote:
>
>> Hi Markus,
>>
>> Thank you for sharing your problem.
>>
>> Looking at the PIG-2579
>> <https://issues.apache.org/jira/browse/PIG-2579> patch, it seems to
>> try to address two issues at the same time:
>> 1) Globs support
>> 2) Multiple Avro schemas support
>>
>> I think that it's better to solve one issue at a time. In fact, there
>> is another jira, PIG-2492
>> <https://issues.apache.org/jira/browse/PIG-2492>, that tries to
>> address #1 in particular. Once PIG-2492 is resolved, I think we can
>> rebase/fix the PIG-2579 patch on top of that.
>>
>> I am happy to work on both jiras. Please let me know what you think.
>>
>> Thanks,
>> Cheolsoo
>>
>> On Tue, Jul 17, 2012 at 4:26 AM, Markus Resch <[EMAIL PROTECTED]> wrote:
>>
>>> Hey everyone,
>>>
>>> in the thread "Downgrade CDH4 to CDH3" of the cloudera mailing list
>>> I talked about issues we had with pig while testing cdh4 and that we
>>> had trouble in switching back to cdh3. After I figured out the
>>> reason for our pig issue I tried to apply the patch
>>> (https://issues.apache.org/jira/browse/PIG-2579) to the cdh4 version
>>> of pig. Sadly this was much harder than applying this particular
>>> patch to the cdh3 version of pig before. Does anyone have this or a
>>> similar patch in a form that is suitable for the cdh4 version of
>>> pig? I'm just asking because doing work twice doesn't help anyone.
>>> If this work is already done: could this patch be attached to the
>>> PIG-2579 ticket as well?
>>>
>>> Thanks
>>>
>>> Markus
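
For readers who want to see what the disabled check at the top of this thread amounts to, here is a minimal, self-contained sketch. It does not use the Avro or piggybank classes: `RecordSchema`, the `rejectSameName` flag, and the field-by-field merge are hypothetical stand-ins chosen for illustration, and the real `AvroStorageUtils.union(Schema, Schema)` in the PIG-2579 patch may merge schemas differently. Only the exception message is taken from the mail.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SchemaUnionSketch {

    /** Hypothetical stand-in for a named record schema: a name plus field name -> type name. */
    static final class RecordSchema {
        final String name;
        final Map<String, String> fields;

        RecordSchema(String name, Map<String, String> fields) {
            this.name = name;
            this.fields = new LinkedHashMap<>(fields); // defensive copy, keeps field order
        }
    }

    /**
     * Merges two record schemas field by field (an assumed merge strategy).
     * With rejectSameName == true the method behaves like the check quoted in the mail;
     * with false it behaves as if that check had been commented out.
     */
    static RecordSchema union(RecordSchema x, RecordSchema y, boolean rejectSameName) {
        if (rejectSameName && x.name.equals(y.name)) {
            throw new RuntimeException("Union of two schemas of the same name is not supported");
        }
        Map<String, String> merged = new LinkedHashMap<>(x.fields);
        for (Map.Entry<String, String> e : y.fields.entrySet()) {
            // putIfAbsent returns the existing type if the field is already present
            String existing = merged.putIfAbsent(e.getKey(), e.getValue());
            if (existing != null && !existing.equals(e.getValue())) {
                throw new RuntimeException("Conflicting types for field " + e.getKey());
            }
        }
        return new RecordSchema(x.name, merged);
    }

    public static void main(String[] args) {
        Map<String, String> v1 = new LinkedHashMap<>();
        v1.put("id", "long");
        v1.put("name", "string");
        Map<String, String> v2 = new LinkedHashMap<>();
        v2.put("id", "long");
        v2.put("email", "string");

        // Two versions of the same record type, so both carry the same name.
        RecordSchema a = new RecordSchema("Event", v1);
        RecordSchema b = new RecordSchema("Event", v2);

        RecordSchema merged = union(a, b, false);
        System.out.println(merged.name + " " + merged.fields.keySet());
    }
}
```

In this sketch, calling `union(a, b, true)` on the two "Event" schemas throws, which is the situation described above where globbed files contain different versions of a schema with the same name. Whether dropping the check is safe in general depends on how conflicting fields are handled, which is the question raised for the piggybank codebase.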