Johannes Schwenk 2012-08-23, 15:49
Cheolsoo Park 2012-08-23, 18:11
Cheolsoo Park 2012-08-23, 19:02
Re: AvroStorage load and store, schema with maps
Johannes Schwenk 2012-09-03, 11:16
Thank you very much!
I was confused because it seems to be OK to pass parameters to DEFINEd functions. If this does not work, trying to pass them anyway should be a syntax error. Maybe a parser exception could be thrown?

Thanks again!
Johannes

On 23.08.2012 21:02, Cheolsoo Park wrote:
> Actually, I found it in Pig manual:
>
>> If you need to use different constructor parameters for different calls to
>> the function you will need to create multiple defines – one for each
>> parameter set.
>
> For example, this works:
>
>> DEFINE AvroStorageNoParam
>> org.apache.pig.piggybank.storage.avro.AvroStorage();
>> DEFINE AvroStorageWithParam
>> org.apache.pig.piggybank.storage.avro.AvroStorage('schema', '{"type" :
>> "map","values" : "string"}');
>> loaded_data = LOAD 'map.avro' USING *AvroStorageNoParam*;
>> describe loaded_data;
>> STORE loaded_data INTO 'output' USING *AvroStorageWithParam*;
>
> Please see the usage section:
> http://pig.apache.org/docs/r0.10.0/basic.html#define-udfs
>
> Thanks,
> Cheolsoo
>
> On Thu, Aug 23, 2012 at 11:11 AM, Cheolsoo Park <[EMAIL PROTECTED]> wrote:
>
>> Hi Johannes,
>>
>> I was able to reproduce your error with the following Avro schema:
>>
>>> {
>>> "type" : "map",
>>> "values" : "string"
>>> }
>>
>> The issue is not in AvroStorage but in the DEFINE statement.
>>
>> DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();
>>
>> AvroStorage has two constructors: one with no parameter and the other with
>> parameters. To define output Avro schema, the second one must be used. But
>> your DEFINE statement makes the first constructor be used always, resulting
>> that output Avro schema is not set. If you remove the DEFINE statement and
>> use the fully qualified name of AvroStorage, everything works.
>> For example,
>>
>>> loaded_data = LOAD 'map.avro' USING
>>> *org.apache.pig.piggybank.storage.avro.AvroStorage*();
>>> describe loaded_data;
>>> STORE loaded_data INTO 'output' USING
>>> *org.apache.pig.piggybank.storage.avro.AvroStorage*('schema', '
>>> {
>>> "type" : "map",
>>> "values" : "string"
>>> }
>>> ');
>>
>> Now the question is why DEFINE does not work here.
>>
>> Thanks,
>> Cheolsoo
>>
>> On Thu, Aug 23, 2012 at 8:49 AM, Johannes Schwenk <[EMAIL PROTECTED]> wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to execute the following pig script with pig-0.10.0 and yarn
>>> (cdh4.0.0):
>>>
>>> --
>>> DEFINE AvroStorage org.apache.pig.piggybank.storage.avro.AvroStorage();
>>> loaded_data = LOAD '$input' USING AvroStorage();
>>> STORE loaded_data INTO '$output' USING AvroStorage('same', '$input');
>>> --
>>>
>>> I call the pig this way:
>>>
>>> pig
>>> -Dpig.additional.jars=lib/piggybank.jar:lib/json-simple-1.1.jar:lib/avro-1.5.3.jar
>>> -file script.pig -param input=input.avro -param output=output.avro
>>>
>>> The input.avro has the following schema:
>>>
>>> http://pastebin.com/ZWU6qLWx
>>>
>>> I always get
>>>
>>> <file script.pig, line 3, column 0> Output Location Validation Failed
>>> for: 'xxx/output.avro' More info to follow:
>>> Please provide schema for Map field!
>>> Details at logfile: xxx/pig_1345735999390.log
>>>
>>> Log excerpt:
>>>
>>> Please provide schema for Map field!
>>> at org.apache.pig.newplan.logical.rules.InputOutputFileValidator$InputOutputFileVisitor.visit(InputOutputFileValidator.java:75)
>>> at org.apache.pig.newplan.logical.relational.LOStore.accept(LOStore.java:77)
>>> at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:64)
>>> at org.apache.pig.newplan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:66)
>>> at org.apache.pig.newplan.DepthFirstWalker.walk(DepthFirstWalker.java:53)
>>> at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:50)
>>> at org.apache.pig.newplan.logical.rules.InputOutputFileValidator.validate(InputOutputFileValidator.java:45)
>>> at

Johannes Schwenk
Software Developer (Reporting)
________________________________________________________
ADITION technologies AG
Schwarzwaldstraße 78b
79117 Freiburg
http://www.adition.com
T +49 / (0)761 / 88147 - 30
F +49 / (0)761 / 88147 - 77
SUPPORT +49 / (0)1805 - ADITION
(landline rate 14 ct/min; mobile rates at most 42 ct/min)
Registered at Amtsgericht Düsseldorf under HRB 54076
Executive Board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
Chairman of the Supervisory Board: Rechtsanwalt (attorney) Daniel Raimer
VAT ID No.: DE 218 858 434
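
For reference, the advice quoted above (one DEFINE per constructor parameter set) could be applied to the original script roughly as follows. This is an untested sketch. It assumes the 'same' option used in the original script is accepted by the AvroStorage constructor in this piggybank version, and that the $input and $output parameters are substituted inside DEFINE as they are elsewhere in the script. The alias names AvroLoad and AvroStoreSame are made up for illustration.

```pig
-- One alias per constructor parameter set, as the Pig manual suggests.
-- AvroLoad and AvroStoreSame are hypothetical names chosen for this sketch.
DEFINE AvroLoad org.apache.pig.piggybank.storage.avro.AvroStorage();
DEFINE AvroStoreSame org.apache.pig.piggybank.storage.avro.AvroStorage('same', '$input');

-- Load with the no-argument constructor.
loaded_data = LOAD '$input' USING AvroLoad;

-- Store with the parameterized constructor, so the output schema is set.
STORE loaded_data INTO '$output' USING AvroStoreSame;
```

The aliases are used without argument lists, matching the working example in the quoted reply; any arguments belong in the DEFINE statement itself.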