Pig >> mail # user >> Verifying unordered output with PigUnit


Johannes Schwenk 2012-05-29, 12:35
Jonathan Coveney 2012-05-29, 17:42
Re: Verifying unordered output with PigUnit
Hello again!

I don't need the output sorted in normal operation of my script, so I
would rather not sort it, as this prolongs the running time unnecessarily.

So I still have the problem that I cannot compare the unsorted output of
the script to the expected output. I am doing this in PigUnit, so I had a
look at org.apache.pig.pigunit.PigTest, and the only option I could see
is to override assertOutput and write a new version of readFile so that
both return sorted records, which does not strike me as very elegant...
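
To make the idea concrete, here is a minimal sketch of an order-insensitive
comparison in plain Java. It sorts only inside the test, so the script itself
stays unsorted. The class and method names (UnorderedOutput, sortedCopy,
sameRecords) are hypothetical, and it assumes the actual and expected records
are already available as lists of strings; how they are collected from PigTest
is not shown and would depend on the Pig version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of PigUnit.
public class UnorderedOutput {

    /** Returns a sorted copy, leaving the input list untouched. */
    static List<String> sortedCopy(List<String> lines) {
        List<String> copy = new ArrayList<>(lines);
        Collections.sort(copy);
        return copy;
    }

    /** True if both lists hold the same records with the same multiplicities. */
    static boolean sameRecords(List<String> expected, List<String> actual) {
        return sortedCopy(expected).equals(sortedCopy(actual));
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("(a,1)", "(b,2)", "(b,2)");
        List<String> actual = Arrays.asList("(b,2)", "(a,1)", "(b,2)");
        System.out.println(sameRecords(expected, actual));
    }
}
```

Since test fixtures are usually small, the O(n log n) sort costs little here,
and duplicates are handled correctly because the sorted lists are compared
element by element.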

Has nobody run into this problem with PigUnit so far?

Thanks!

On 29.05.2012 19:42, Jonathan Coveney wrote:
> Generally, sorting is the way to go. It's going to be difficult to get
> around doing some sort of processing in order to make it easier to evaluate
> equality.
>
> If you want something generally O(n) instead of O(n log n), you could
> calculate the hashCode for every tuple then SUM it (which is algebraic),
> and only in the case that these are not equal (exceedingly rare) would you
> sort and directly do the comparison.
>
> 2012/5/29 Johannes Schwenk <[EMAIL PROTECTED]>
>
>> Hello all,
>>
>> I'd like to verify output from a Pig script that does not sort its
>> results prior to output. Thus the order of the tuples in the output is
>> non-deterministic. I would rather not add sorting to my script, because
>> I am potentially dealing with a lot of data here. As far as I can tell,
>> Pig Latin does not support conditional statements like "if PIG_UNIT_TEST
>> do stepsA else do stepsB fi", so this is also not an option (quite apart
>> from requiring duplicate and differing logic for test and non-test runs!).
>>
>> So how could I do this?
>>
>> Greetings,
>> Johannes Schwenk
>>
>> --
>> Software Developer (Reporting)
>> ________________________________________________________
>>
>> ADITION technologies AG
>> Schwarzwaldstraße 78b
>> 79117 Freiburg
>>
>> http://www.adition.com
>>
>> T +49 / (0)761 / 88147 - 30
>> F +49 / (0)761 / 88147 - 77
>> SUPPORT +49  / (0)1805 - ADITION
>>
>> (landline rate 14 ct/min; mobile rates at most 42 ct/min)
>>
>> Registered at the Amtsgericht Düsseldorf (district court) under HRB 54076
>> Executive Board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
>> Chairman of the Supervisory Board: Daniel Raimer, attorney
>> VAT ID No.: DE 218 858 434
>>
>>
>
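
The hash-sum idea from the quoted reply could look roughly like the sketch
below, written in plain Java rather than as a Pig aggregate. The names
(HashSumCheck, hashSum, mayBeEqual) are hypothetical, and records are assumed
to be available as strings. Because addition is commutative, the sum does not
depend on record order. Differing sums prove the outputs differ; equal sums
only make equality very likely, since hash collisions are possible.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper illustrating an O(n), order-independent pre-check.
public class HashSumCheck {

    /** Sum of the hash codes of all records; independent of record order. */
    static long hashSum(List<String> records) {
        long sum = 0;
        for (String record : records) {
            sum += record.hashCode();
        }
        return sum;
    }

    /**
     * Necessary but not sufficient condition for equality:
     * false means the outputs definitely differ,
     * true means they probably match (collisions remain possible).
     */
    static boolean mayBeEqual(List<String> expected, List<String> actual) {
        return expected.size() == actual.size()
                && hashSum(expected) == hashSum(actual);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("(a,1)", "(b,2)");
        List<String> actual = Arrays.asList("(b,2)", "(a,1)");
        System.out.println(mayBeEqual(expected, actual));
    }
}
```

When the check fails, sorting both sides afterwards is still useful for
reporting which records differ.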

Johannes Schwenk
