Pig >> mail # user >> Pig / Map data type for keys / String cannot be cast to integer


RE: Pig / Map data type for keys / String cannot be cast to integer
Sure. That works too. Probably, it's a better idea to return a boolean
instead of a String.

Thanks,
Santhosh
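
A minimal sketch of the boolean-returning comparison suggested above, written as plain Java so it runs without the Pig jars. The class name ConditionCheck and the method matches are hypothetical names chosen for illustration; they are not part of Pig or of the original UDF.

```java
// Hypothetical helper, not part of Pig: the comparison from ConditionFilter
// expressed as a boolean predicate instead of a "TRUE"/"FALSE" string.
final class ConditionCheck {
    private ConditionCheck() {}

    // Returns true only when both arguments are Strings and are equal.
    // Non-string values (for example an Integer 35) and nulls yield false
    // instead of triggering a ClassCastException.
    static boolean matches(Object value, Object condition) {
        if (!(value instanceof String) || !(condition instanceof String)) {
            return false;
        }
        return value.equals(condition);
    }

    public static void main(String[] args) {
        System.out.println(matches("Location", "Location"));
        System.out.println(matches(Integer.valueOf(35), "Location"));
        System.out.println(matches(null, "Location"));
    }
}
```

In an actual UDF this logic would presumably sit in exec() of a class extending org.apache.pig.FilterFunc (which is an EvalFunc<Boolean>), so that FILTER receives a Boolean directly.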

-----Original Message-----
From: Mathias Fryde [mailto:[EMAIL PROTECTED]]
Sent: Thursday, March 19, 2009 1:35 AM
To: [EMAIL PROTECTED]
Subject: Re: Pig / Map data type for keys / String cannot be cast to
integer

Hi,
Thanks for your answer. I managed to build a UDF to make my filter work:

My script:
register './myudf.jar'
source = LOAD 'thesource' USING PigStorage('|') AS ( themap: map []);
A = FILTER source BY org.apache.pig.myudf.ConditionFilter(themap#'key01', 'Location');
DUMP A;

The UDF code:

package org.apache.pig.myudf;

import java.io.IOException;
import java.lang.*;
import java.util.Date;
import java.text.*;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;

public class ConditionFilter extends EvalFunc<String>
{

    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return null;
        try{
            Object obj = input.get(0);
            String ourresult = "FALSE";
            if (obj instanceof String) {
                String ourinput = (String) obj;
                String ourcondition = (String) input.get(1);
                if (ourinput.equals(ourcondition))
                    ourresult = "TRUE";
            }

            return ourresult;

        } catch (Exception e) {
            throw WrappedIOException.wrap("Caught exception processing input row ", e);
        }
    }
}
regards,
Mathias.
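
The sample rows quoted further down in this thread mix string and numeric values for key01, and the value 35 is reported as being interpreted as an integer. The sketch below illustrates why the instanceof guard in the UDF matters on that data. It is plain Java, not Pig's loader: the class MixedTypeRows and the rule in inferValue (numeric-looking text becomes an Integer) are assumptions made only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration, not Pig code.
final class MixedTypeRows {
    private MixedTypeRows() {}

    // The sample rows posted in the thread.
    static final List<String> SAMPLE_ROWS = Arrays.asList(
        "[key01#Location,key02#value01]",
        "[key01#35,key02#value01]",
        "[key01#Location,key02#value01]",
        "[key01#Zone,key02#value01]",
        "[key01#toto,key02#value01]",
        "[key01#0,key02#value01]",
        "[key02#value01]",
        "[key01#Location,key02#value01]");

    // Assumed inference rule: numeric-looking values become Integer,
    // everything else stays a String.
    static Object inferValue(String raw) {
        try {
            return Integer.valueOf(raw);
        } catch (NumberFormatException e) {
            return raw;
        }
    }

    // Parses one row of the form "[k#v,k#v]" into a map of typed values.
    static Map<String, Object> parseRow(String row) {
        Map<String, Object> map = new LinkedHashMap<>();
        String body = row.substring(1, row.length() - 1);
        for (String pair : body.split(",")) {
            String[] kv = pair.split("#", 2);
            if (kv.length < 2) {
                continue; // skip empty or malformed entries
            }
            map.put(kv[0], inferValue(kv[1]));
        }
        return map;
    }

    // Guarded comparison: only String values are compared, so Integer
    // values and missing keys are skipped rather than cast.
    static int countMatches(List<String> rows, String key, String wanted) {
        int count = 0;
        for (String row : rows) {
            Object value = parseRow(row).get(key);
            if (value instanceof String && value.equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    // Shows what an unguarded cast does with a non-String value.
    static boolean unguardedCastThrows(Object value) {
        try {
            String s = (String) value;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(countMatches(SAMPLE_ROWS, "key01", "Location")); // prints 3
        System.out.println(unguardedCastThrows(Integer.valueOf(35)));       // prints true
    }
}
```

Under that assumed rule, key01 is an Integer on the rows holding 35 and 0, absent on one row, and a String elsewhere, so only the guarded comparison gets through all eight rows.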

2009/3/18 Santhosh Srinivasan <[EMAIL PROTECTED]>

> The value 35 for key01 on the second row is being interpreted as an
> integer. The issue is captured in PIG-724
> (https://issues.apache.org/jira/browse/PIG-724). The only workaround I
> can think of right now is to ensure consistency in the values for the
> keys, i.e., knock off numeric values for key key01 in your data.
>
> Thanks,
> Santhosh
>
> -----Original Message-----
> From: Mathias Fryde [mailto:[EMAIL PROTECTED]]
> Sent: Wednesday, March 18, 2009 5:47 AM
> To: [EMAIL PROTECTED]
> Subject: Re: Pig / Map data type for keys / String cannot be cast to
> integer
>
> Hello, here is a simple example.
> regards,
>
> [key01#Location,key02#value01]
> [key01#35,key02#value01]
> [key01#Location,key02#value01]
> [key01#Zone,key02#value01]
> [key01#toto,key02#value01]
> [key01#0,key02#value01]
> [key02#value01]
> [key01#Location,key02#value01]
>
>
>
>
> 2009/3/18 Mridul Muralidharan <[EMAIL PROTECTED]>
>
> >
> > I am not sure what the problem is, but my guess would be that the map is
> > getting loaded as (chararray, int) - and the comparison is failing cos of
> > it?
> > Someone from pig team can confirm if this is a known bug/issue (iirc it is
> > not).
> >
> > If possible, can you post a snippet of the input file ?
> >
> > Thanks,
> > Mridul
> >
> >
> >
> > Mathias Fryde wrote:
> >
> >> Hello,
> >> I have a map with a key that has different values, like: Location, Zone,
> >> 35, Place ....
> >> When I execute my script with a filter on this key I get a cast
> >> error.
> >>
> >> source = LOAD 'thesource' USING PigStorage('|') AS ( themap: map []);
> >> A = FILTER source BY themap#'key01' == 'Location' ;
> >> DUMP A;
> >>
> >> I also tried:
> >> A = FILTER source BY (chararray)themap#'key01' == 'Location' ;
> >> A = FILTER source BY themap#'data:Resource:id' matches '.*Location.*';
> >> but I still get the same error ...
> >>
> >>   - Does anyone have a solution? How do I do it?
> >>   - Why doesn't Pig use string by default?
> >>   - Does Pig determine data types on its own?
> >>
> >>
> >>
> >> org.apache.pig.tools.grunt.Grunt -
> >> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
> >> open iterator for alias A
> >>        at org.apache.pig.PigServer.openIterator(PigServer.java:438)
org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:359)
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptPar
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java
ERROR
ERROR
org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execut
org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:695)
be
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOp
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOp
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOp
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOp
org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOper
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOp
org.apache.pig.backend.local.executionengine.LocalPigLauncher.launchPig(
org.apache.pig.backend.local.executionengine.LocalExecutionEngine.execut