RE: Pig / Map data type for keys / String cannot be cast to integer
Sure. That works too. Probably, it's a better idea to return a boolean
instead of a String.
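
As a rough illustration of that suggestion, the comparison can be written to yield a boolean directly. The sketch below deliberately avoids Pig classes so that it compiles on its own; the class name ConditionCheck and the method matches are made up for this example. In an actual UDF the same logic would presumably go into exec(Tuple) of a class extending org.apache.pig.FilterFunc, which is an EvalFunc<Boolean>.

```java
// Hypothetical helper, not part of Pig: the UDF's comparison logic
// rewritten to return a Boolean instead of the strings "TRUE"/"FALSE".
final class ConditionCheck {

    private ConditionCheck() {
    }

    // Returns TRUE only when the map value is a String equal to the
    // expected text. Values of any other runtime type (for example an
    // Integer such as 35) and missing values yield FALSE, not a cast error.
    static Boolean matches(Object mapValue, String expected) {
        if (!(mapValue instanceof String)) {
            return Boolean.FALSE;
        }
        return Boolean.valueOf(mapValue.equals(expected));
    }
}
```

With a boolean result, the function call could serve as the FILTER condition by itself, without comparing its output to a string.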

Thanks,
Santhosh

-----Original Message-----
From: Mathias Fryde [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 19, 2009 1:35 AM
To: [EMAIL PROTECTED]
Subject: Re: Pig / Map data type for keys / String cannot be cast to integer

Hi,
Thanks for your answer. I managed to build a UDF to make my filter work:

My script:
register './myudf.jar'
source = LOAD 'thesource' USING PigStorage('|') AS (themap: map[]);
A = FILTER source BY org.apache.pig.myudf.ConditionFilter(themap#'key01', 'Location');
DUMP A;

The UDF code:

package org.apache.pig.myudf;

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;

// Returns "TRUE" when the first argument is a String equal to the second
// argument, and "FALSE" otherwise.
public class ConditionFilter extends EvalFunc<String>
{

    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return null;
        try {
            Object obj = input.get(0);
            String ourresult = "FALSE";
            // Compare only when the map value really is a String; values
            // loaded as another type (e.g. 35) stay "FALSE".
            if (obj instanceof String) {
                String ourinput = (String) obj;
                String ourcondition = (String) input.get(1);
                if (ourinput.equals(ourcondition))
                    ourresult = "TRUE";
            }

            return ourresult;

        } catch (Exception e) {
            throw WrappedIOException.wrap("Caught exception processing input row ", e);
        }
    }
}
regards,
Mathias.

2009/3/18 Santhosh Srinivasan <[EMAIL PROTECTED]>

> The value 35 for key01 on the second row is being interpreted as an
> integer. The issue is captured in PIG-724
> (https://issues.apache.org/jira/browse/PIG-724). The only workaround I
> can think of right now is to ensure consistency in the values for the
> keys, i.e., remove the numeric values for key key01 from your data.
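
The failure mode described here can be reproduced outside Pig. The following is only a sketch under stated assumptions: inferValue is a hypothetical stand-in for whatever the loader does when it guesses a type for each map value, and genericEquals is a hypothetical stand-in for a type-agnostic comparison. Neither is Pig's actual code. It shows how one key holding both 'Location' and 35 can turn a comparison against a string constant into a cast error.

```java
// Hypothetical illustration only; NOT Pig's loader or comparison code.
final class MapValueTyping {

    private MapValueTyping() {
    }

    // Guess a type for a raw map value: Integer if it parses as one,
    // otherwise keep it as a String.
    static Object inferValue(String raw) {
        try {
            return Integer.valueOf(raw);
        } catch (NumberFormatException e) {
            return raw;
        }
    }

    // Type-agnostic comparison through the raw Comparable interface.
    // When left is an Integer and right is a String, compareTo throws
    // ClassCastException because the String cannot be cast to Integer.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean genericEquals(Object left, Object right) {
        return ((Comparable) left).compareTo(right) == 0;
    }
}
```

In this sketch the rows whose key01 value is 'Location' or 'Zone' compare cleanly, while the row holding 35 is the one that fails, which is why making the values consistent avoids the error.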
>
> Thanks,
> Santhosh
>
> -----Original Message-----
> From: Mathias Fryde [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, March 18, 2009 5:47 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Pig / Map data type for keys / String cannot be cast to integer
>
> Hello, here is a simple example.
> regards,
>
> [key01#Location,key02#value01]
> [key01#35,key02#value01]
> [key01#Location,key02#value01]
> [key01#Zone,key02#value01]
> [key01#toto,key02#value01]
> [key01#0,key02#value01]
> [key02#value01]
> [key01#Location,key02#value01]
>
>
>
>
> 2009/3/18 Mridul Muralidharan <[EMAIL PROTECTED]>
>
> >
> > I am not sure what the problem is, but my guess would be that the map is
> > getting loaded as (chararray, int), and the comparison is failing because
> > of it?
> > Someone from the Pig team can confirm if this is a known bug/issue (iirc
> > it is not).
> >
> > If possible, can you post a snippet of the input file ?
> >
> > Thanks,
> > Mridul
> >
> >
> >
> > Mathias Fryde wrote:
> >
> >> Hello,
> >> I have a map with a key that has different values, like: Location, Zone,
> >> 35, Place ...
> >> When I execute my script with a filter on this key, I get a cast
> >> error.
> >>
> >> source = LOAD 'thesource' USING PigStorage('|') AS (themap: map[]);
> >> A = FILTER source BY themap#'key01' == 'Location';
> >> DUMP A;
> >>
> >> I also tried:
> >> A = FILTER source BY (chararray)themap#'key01' == 'Location';
> >> A = FILTER source BY themap#'data:Resource:id' matches '.*Location.*';
> >> but I still get the same error ...
> >>
> >>   - Does anyone have a solution? How can I do it?
> >>   - Why doesn't Pig use string as the default type?
> >>   - Does Pig determine data types on its own?
> >>
> >>
> >>
> >> org.apache.pig.tools.grunt.Grunt -
> >> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
> >> open iterator for alias A
> >>        at org.apache.pig.PigServer.openIterator(PigServer.java:438)
> >>        at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:359)
> >>        at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptPar
> >>        at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java
> >> ERROR
> >> ERROR
> >>        at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execut
> >>        at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:695)
> >> be
> >>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOp
> >>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOp
> >>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOp
> >>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOp
> >>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOper
> >>        at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOp
> >>        at org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(
> >>        at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execut