Johannes Schwenk 2012-05-29, 12:35
Jonathan Coveney 2012-05-29, 17:42
Re: Verifying unordered output with PigUnit
Johannes Schwenk 2012-05-30, 09:02
Hello again!
I don't have to sort the output in normal operation of my script, so I would rather not, as this prolongs running time unnecessarily... So I still have the problem that I cannot compare the unsorted output of the script to the expected one.

I am doing this in PigUnit, so I had a look at org.apache.pig.pigunit.PigTest, and the only option I could see is to override assertOutput and write a new version of readFile, ensuring that those functions return sorted records, which I thought was not that elegant...

Has nobody had this problem with PigUnit to date?

Thanks!

On 29.05.2012 19:42, Jonathan Coveney wrote:
> Generally, sorting is the way to go. It's going to be difficult to get
> around doing some sort of processing in order to make it easier to evaluate
> equality.
>
> If you want something generally O(n) instead of O(n log n), you could
> calculate the hashCode for every tuple then SUM it (which is algebraic),
> and only in the case that these are not equal (exceedingly rare) would you
> sort and directly do the comparison.
>
> 2012/5/29 Johannes Schwenk <[EMAIL PROTECTED]>
>
>> Hello all,
>>
>> I'd like to verify output from a pig script that does not sort its
>> results prior to output. Thus the order of the tuples in the output is
>> non-deterministic. I would rather not add sorting to my script, because
>> I am potentially dealing with a lot of data here. As I have found,
>> PigLatin does not support conditional statements like "if PIG_UNIT_TEST
>> do stepsA else do stepsB fi" - so this is also not an option (besides
>> having duplicate and differing logic for test and non-test runs!).
>>
>> So how could I do this?
>>
>> Greetings,
>> Johannes Schwenk
>>
>> --
>> Software Developer (Reporting)
>> ________________________________________________________
>>
>> ADITION technologies AG
>> Schwarzwaldstraße 78b
>> 79117 Freiburg
>>
>> http://www.adition.com
>>
>> T +49 / (0)761 / 88147 - 30
>> F +49 / (0)761 / 88147 - 77
>> SUPPORT +49 / (0)1805 - ADITION
>>
>> (landline rate 14 ct/min; mobile rates at most 42 ct/min)
>>
>> Registered at the Düsseldorf District Court under HRB 54076
>> Executive Board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
>> Chairman of the Supervisory Board: Attorney Daniel Raimer
>> VAT ID: DE 218 858 434
>>
>>
>

Johannes Schwenk
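
The two ideas discussed in this thread (summing per-tuple hash codes, and sorting before comparing) could be combined roughly as in the sketch below. It is only an illustration under stated assumptions: the class and method names (UnorderedOutputCheck, hashSum, sameIgnoringOrder) are hypothetical and not part of PigUnit, the records are assumed to be available as plain strings, one per output tuple, and how they are obtained from PigTest is left out. Unlike the suggestion quoted above, this version treats the hash sum only as a cheap way to reject a mismatch and still sorts copies of both arrays to confirm a match, since equal sums can in principle come from a collision.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PigUnit: compares two sets of output
// records while ignoring the order in which they were produced.
class UnorderedOutputCheck {

    // Order-independent fingerprint: the sum of the per-record hash codes.
    // Addition is commutative, so any permutation yields the same sum.
    static long hashSum(String[] records) {
        long sum = 0;
        for (String record : records) {
            sum += record.hashCode();
        }
        return sum;
    }

    // Returns true if both arrays contain the same records with the same
    // multiplicities, regardless of order.
    static boolean sameIgnoringOrder(String[] actual, String[] expected) {
        if (actual.length != expected.length) {
            return false;
        }
        // O(n) rejection: different sums mean the contents must differ.
        if (hashSum(actual) != hashSum(expected)) {
            return false;
        }
        // Equal sums could still be a collision, so confirm on sorted copies.
        // The inputs themselves are left untouched.
        String[] sortedActual = actual.clone();
        String[] sortedExpected = expected.clone();
        Arrays.sort(sortedActual);
        Arrays.sort(sortedExpected);
        return Arrays.equals(sortedActual, sortedExpected);
    }
}
```

In a test one would then assert that sameIgnoringOrder(actualRecords, expectedRecords) is true instead of comparing the output line by line. The sorting happens only in the test code, so the Pig script itself needs no ORDER step.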