Lex H 2012-11-22, 01:54
Ruslan Al-Fakikh 2012-11-22, 12:11
Re: How to filter by pig datatype?
pablomar 2012-11-22, 17:48
Did you try a filter function? Something like:

    import java.io.IOException;

    import org.apache.pig.FilterFunc;
    import org.apache.pig.data.Tuple;
    import org.apache.pig.impl.util.WrappedIOException;

    public class IsMap extends FilterFunc {
        public Boolean exec(Tuple input) throws IOException {
            // Nothing to test: return null.
            if (input == null || input.size() == 0)
                return null;
            try {
                // True only when the first field is a Pig map (a java.util.Map).
                return input.get(0) instanceof java.util.Map;
            } catch (Exception e) {
                throw WrappedIOException.wrap("ouch!", e);
            }
        }
    }

and then:

    filtered = FILTER some_data BY IsMap(some_variable);

PS: I didn't try it with your data.

On Wed, Nov 21, 2012 at 8:54 PM, Lex H <[EMAIL PROTECTED]> wrote:
> Attached is a tiny testcase illustrating my problem.
>
> What I would like to know is how to filter by Pig datatype.
> e.g. something like:
> filtered = FILTER some_data BY some_variable IS_MAP_TYPE;
>
> Can anyone advise if this can be accomplished with Pig?
>
> We have a field that is sometimes a 'map', sometimes a chararray.
>
> Doing something like the following statement fails, presumably because
> it's trying to do a key-value lookup on something that's not a 'map'.
>
> -- json#'data' is sometimes a map, sometimes not.
> trivias = FOREACH data GENERATE json#'data'#'trivia' AS trivia:chararray;
>
> This has come about from us working with JSON data with Pig via Elephant
> Bird's JsonLoader.
>
> Thanks,
>
> Lex.
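
The core of the suggested filter is a plain `instanceof java.util.Map` test on the field value. As a rough sketch, the same check can be tried outside Pig, with no Pig jars on the classpath. The class `MapTypeCheck` and its methods are hypothetical names used only for illustration. They are not part of Pig or Elephant Bird, and the sample values merely stand in for a field that is sometimes a map and sometimes a chararray.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Pig class): isolates the type test used by the filter.
class MapTypeCheck {

    // True only when the value is a java.util.Map; null counts as "not a map".
    static boolean isMap(Object value) {
        return value instanceof Map;
    }

    // Keeps only the map values, mirroring FILTER ... BY IsMap(...).
    static List<Object> keepMaps(List<Object> values) {
        List<Object> kept = new ArrayList<>();
        for (Object v : values) {
            if (isMap(v)) {
                kept.add(v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Stand-in for a field that holds a map in some records only.
        Map<String, Object> data = new HashMap<>();
        data.put("trivia", "some text");

        List<Object> values = new ArrayList<>();
        values.add(data);               // a map: kept
        values.add("just a chararray"); // a string: dropped
        values.add(null);               // missing value: dropped

        System.out.println(keepMaps(values).size()); // prints 1
    }
}
```

Unlike this sketch, the `FilterFunc` version returns null for a null or empty input tuple. It would still need to be tested against the real data.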
Lex H 2012-11-22, 22:54
pablomar 2012-11-22, 23:19