Johannes Schwenk 2012-05-21, 16:36
Jonathan Coveney 2012-05-21, 17:11
Re: UDF FilterFunc and logical OR
Johannes Schwenk 2012-05-22, 16:37
Thank you for your quick suggestions!
- I am now using local mode. Good point!
- I know about the builtin matches; the CONTAINS filter was just a way to get into programming UDFs...
- Whatever I do, the problem persists. I tried:
  * turning off all optimizations (-t All): no effect
  * reordering the statements: the outcome still contains only the tuples matching the lhs of the OR
  * using different data (just in case...): no effect
  * finally, counting how many times the exec() function gets called while processing the script: exactly *six times*, once per record!

That last observation leads me to believe that this is a bug!? The exec function should be called at least *ten times*, I think.

Do you have any suggestions on how to verify this?

Greetings

On 21.05.2012 19:11, Jonathan Coveney wrote:
> Not sure why it is failing... though I will mention two things. 1) you
> should use local mode if possible, especially just to test UDFs :) 2) you
> could use the builtin matches function to achieve this (ie matches
> '.*keyword.*')
>
> Besides that it is odd indeed, and I'd have to dig in more.
>
> 2012/5/21 Johannes Schwenk <[EMAIL PROTECTED]>
>
>> Hello List,
>>
>> I am using Cloudera's distribution (cdh3u3), which comes with pig-0.8.1.
>>
>> I have written a UDF extending FilterFunc that checks if the provided
>> string is contained within the specified column of the current tuple:
>> http://pastebin.com/Uwje7v1V
>>
>> I have also written some TestCases:
>> http://pastebin.com/uA4LHB4Q
>>
>> The odd thing is that only TestCase testFilteringClusterWithOR1 fails,
>> because the result does not have the expected length of 3 but is of length 2
>> instead (line 177 in http://pastebin.com/Uwje7v1V). After a lot of
>> investigating I still cannot find out why testFilteringCluster and
>> testFilteringClusterWithOR2 succeed but not testFilteringClusterWithOR1.
>> Is there a special prerequisite for making my FilterFunc usable within
>> OR? Maybe I have missed something very obvious... Please help me figure
>> this out!
>>
>> Greetings,
>> Johannes Schwenk
>>
>> --
>> Software Developer (Reporting)
>> ________________________________________________________
>>
>> ADITION technologies AG
>> Schwarzwaldstraße 78b
>> 79117 Freiburg
>>
>> http://www.adition.com
>>
>> T +49 / (0)761 / 88147 - 30
>> F +49 / (0)761 / 88147 - 77
>> SUPPORT +49 / (0)1805 - ADITION
>>
>> (Landline rate 14 ct/min; mobile rates at most 42 ct/min)
>>
>> Registered at the Amtsgericht Düsseldorf under HRB 54076
>> Executive Board: Andreas Kleiser, Jörg Klekamp, Tihomir Perkovic, Marcus Schlüter
>> Chairman of the Supervisory Board: Attorney Daniel Raimer
>> VAT ID: DE 218 858 434
>>
>>
>

Johannes Schwenk
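The expectation of ten exec() calls can be checked with a small standalone sketch. It does not use Pig at all: contains() is only a stand-in for the UDF's exec(), the six records are hypothetical data (two match the lhs keyword, one matches the rhs keyword), and it assumes the OR is evaluated with short-circuiting, so the rhs runs only for records that fail the lhs. Under those assumptions, six records should produce ten calls and three matches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class OrCallCount {
    // Counts every invocation of the stand-in predicate.
    static final AtomicInteger calls = new AtomicInteger();

    // Stand-in for the UDF's exec(); not the Pig API.
    static boolean contains(String field, String keyword) {
        calls.incrementAndGet();
        return field.contains(keyword);
    }

    // Returns {number of records kept, number of contains() calls}.
    static int[] filterWithOr(List<String> records, String lhs, String rhs) {
        calls.set(0);
        int kept = 0;
        for (String r : records) {
            // Java's || short-circuits: rhs is evaluated only if lhs is false.
            if (contains(r, lhs) || contains(r, rhs)) {
                kept++;
            }
        }
        return new int[] { kept, calls.get() };
    }

    public static void main(String[] args) {
        // Hypothetical data: two records match "foo", one matches "bar".
        List<String> records = Arrays.asList(
            "foo 1", "foo 2", "bar 1", "baz 1", "baz 2", "baz 3");
        int[] result = filterWithOr(records, "foo", "bar");
        System.out.println("kept=" + result[0] + " calls=" + result[1]);
        // prints "kept=3 calls=10"
    }
}
```

If a real run reports only six calls and two matches, that would be consistent with the rhs of the OR never being evaluated, which is what the counting experiment in the message suggests.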
Jonathan Coveney 2012-05-22, 19:26
Johannes Schwenk 2012-05-23, 09:42
Jonathan Coveney 2012-05-23, 16:20
Johannes Schwenk 2012-05-24, 12:54
Jonathan Coveney 2012-05-24, 16:55
Alan Gates 2012-05-24, 17:15