Pig >> mail # user >> IOException appearing during dump but not illustrate


Re: IOException appearing during dump but not illustrate

That looks to have worked. Thanks.
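
For readers of the archive: the cast suggested in the quoted reply below would go on the map lookup in y3. This is a minimal sketch, assuming the other statements stay exactly as quoted below; the exact statement that was run is not shown in the thread.

```pig
-- A map lookup with no declared value type is typed bytearray by default.
-- The explicit cast declares uid as chararray before it becomes the GROUP key.
y3 = foreach y2 generate (chararray)argMap#'s' as uid, timestamp as timestamp;
y4 = FILTER y3 BY (uid is not null);
y5 = GROUP y4 BY uid;
```

With the cast in place, the declared key type (chararray) lines up with the NullableText that the trace shows the map task emitting, which is the likely reason the type mismatch goes away.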

On Wed, Dec 08, 2010 at 02:04:07PM -0800, Dmitriy Ryaboy wrote:
> Try explicitly casting argMap#'s' to a chararray?
>
>
> On Wed, Dec 8, 2010 at 1:53 PM, Kris Coward <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > I've recently gotten stumped by a problem where my attempts to dump the
> > relations produced by a GROUP command give the following error (though
> > illustrating the same relation works fine):
> >
> > java.io.IOException: Type mismatch in key from map: expected
> > org.apache.pig.impl.io.NullableBytesWritable, recieved
> > org.apache.pig.impl.io.NullableText
> >        at
> > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:807)
> >        at
> > org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:466)
> >        at
> > org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:108)
> > .
> > .
> > .
> >
> > For a little background, the relation that's failing is called y5, and
> > is produced by the following string of commands (in grunt):
> >
> > y2 = foreach y1 generate $0 as timestamp, myudfs.httpArgParse($1) as argMap;
> > y3 = foreach y2 generate argMap#'s' as uid, timestamp as timestamp;
> > y4 = FILTER y3 BY (uid is not null);
> > y5 = GROUP y4 BY uid;
> >
> > and to get an idea what sort of data is involved, ILLUSTRATE y4 yields:
> >
> >
> > -----------------------------------------------------------------------------------------------------
> > | y1     | timestamp: int | args: bag({tuple_of_tokens: (token: chararray)})                        |
> > -----------------------------------------------------------------------------------------------------
> > |        | 1265950806     | {(s=1381688313), (u=F68FFA1F655FDF494ABA520D95E1D99E), (ts=1265950805)} |
> > -----------------------------------------------------------------------------------------------------
> > -----------------------------------------------------------------------------------------------
> > | y2     | timestamp: int | argMap: map                                                       |
> > -----------------------------------------------------------------------------------------------
> > |        | 1265950806     | {u=F68FFA1F655FDF494ABA520D95E1D99E, ts=1265950805, s=1381688313} |
> > -----------------------------------------------------------------------------------------------
> > --------------------------------------------
> > | y3     | uid: bytearray | timestamp: int |
> > --------------------------------------------
> > |        | 1381688313     | 1265950806     |
> > --------------------------------------------
> > --------------------------------------------
> > | y4     | uid: bytearray | timestamp: int |
> > --------------------------------------------
> > |        | 1381688313     | 1265950806     |
> > --------------------------------------------
> >
> > The same problem was also produced when the FILTER command was omitted,
> > and the relevant chunk of code in myudfs.httpArgParse is:
> >
> >    StringTokenizer tok = new StringTokenizer((String)pair, "=", false);
> >    if (tok.hasMoreTokens() ) {
> >        String oKey = tok.nextToken();
> >        if (tok.hasMoreTokens() ) {
> >            Object oValue = tok.nextToken();
> >            output.put(oKey, oValue);
> >        } else {
> >            output.put(oKey, null);
> >        }
> >    }
> >
> > If anyone has any insight how I could get this to work, that'd really
> > help me out.
> >
> > Thanks,
> > Kris
> >
> > P.S. For those who remember my earlier post about getting httpArgParse
> > to compile, I took the advice to ditch the InternalMap in favour of a
> > HashMap<String,Object>.
> >
> > --
> > Kris Coward                                     http://unripe.melon.org/
> > GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3
> >

--
Kris Coward http://unripe.melon.org/
GPG Fingerprint: 2BF3 957D 310A FEEC 4733  830E 21A4 05C7 1FEB 12B3