Pig >> mail # user >> Loading LZOs With Some JSON


Re: Loading LZOs With Some JSON
Well, it's not throwing errors at me anymore. Now it's just discarding the
field. When I run it on two records where I've verified that the field exists
in the JSON, I get:

Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 2 time(s).

More specifically, my JSON is of the following form:

{"foo":0,"bar":"hi"}

On that, I'm running:

initial = LOAD 'some_file.lzo'
    USING com.twitter.elephantbird.pig.store.LzoPigStorage('\\t')
    AS (col1, col2, col3, json_data);
extracted = FOREACH initial GENERATE (chararray) json_data#'type' AS type;
DUMP extracted;

Which gives me the above warning along with:

()
()

I also tried it without the cast to chararray, but got the same
results. Should I be declaring json_data as some other data type when I
load it initially? It seems to default to bytearray when I run
DESCRIBE on initial. Would that be a problem?
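
For reference, one way the script above could be restructured is sketched below. It is a guess at the cause rather than a confirmed fix: json_data is loaded as a bytearray holding raw JSON text, not as a Pig map, so applying # to it directly likely fails the implicit conversion and produces the FIELD_DISCARDED_TYPE_CONVERSION_FAILED warning. The sketch assumes the UDF discussed in this thread is elephant-bird's JsonStringToMap; the jar paths are placeholders, and the key 'bar' is taken from the sample record because a key that is absent from the map yields null.

```pig
-- Jar paths are placeholders; substitute the versions available on your cluster.
REGISTER /path/to/hadoop-lzo.jar;
REGISTER /path/to/elephant-bird.jar;
REGISTER /path/to/guava.jar;
REGISTER /path/to/json-simple-1.1.jar;

-- Assumption: the UDF mentioned earlier in the thread is JsonStringToMap.
DEFINE JsonStringToMap com.twitter.elephantbird.pig.piggybank.JsonStringToMap();

-- Declare json_data as chararray so the UDF receives a string, not a bytearray.
initial = LOAD 'some_file.lzo'
    USING com.twitter.elephantbird.pig.store.LzoPigStorage('\\t')
    AS (col1, col2, col3, json_data:chararray);

-- Parse the JSON text into a Pig map before dereferencing it with #.
parsed = FOREACH initial GENERATE JsonStringToMap(json_data) AS json;

-- Map values still need an explicit cast; 'bar' exists in the sample record.
extracted = FOREACH parsed GENERATE (chararray) json#'bar' AS bar;
DUMP extracted;
```

Declaring json_data as chararray in the LOAD schema also addresses the earlier "DataByteArray cannot be cast to java.lang.String" error, if that error came from handing the UDF an untyped field.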

Thanks for all the help so far!

Eli

On 9/12/11 9:26 PM, Dmitriy Ryaboy wrote:
> Ah yeah, that's my favorite thing about Pig maps (prior to Pig 0.9,
> theoretically).
> The values are bytearrays. You are probably trying to treat them as strings.
> You have to do stuff like this:
>
> x = foreach myrelation generate
>    (chararray) mymap#'foo' as foo,
>    (chararray) mymap#'bar' as bar;
>
>
> On Mon, Sep 12, 2011 at 11:54 AM, Eli Finkelshteyn <[EMAIL PROTECTED]> wrote:
>
>> Hmmm, now it gets past my mention of the function, but when I run a dump on
>> generated information, I get:
>>
>> 2011-09-12 14:48:12,814 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>> ERROR 2997: Unable to recreate exception from backed error:
>> java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot
>> be cast to java.lang.String
>>
>> Thanks for all the help so far!
>>
>> Eli
>>
>>
>> On 9/12/11 2:42 PM, Dmitriy Ryaboy wrote:
>>
>>> You also want json-simple-1.1.jar
>>>
>>>
>>> On Mon, Sep 12, 2011 at 10:46 AM, Eli Finkelshteyn <[EMAIL PROTECTED]> wrote:
>>>
>>>> Hmm, I'm loading up hadoop-lzo.*.jar, elephant-bird.*.jar, guava-*.jar,
>>>> and piggybank.jar, and then trying to use that UDF, but getting the
>>>> following error:
>>>>
>>>> ERROR 2998: Unhandled internal error. org/json/simple/parser/ParseException
>>>>
>>>> java.lang.NoClassDefFoundError: org/json/simple/parser/ParseException
>>>>         at java.lang.Class.forName0(Native Method)
>>>>         at java.lang.Class.forName(Class.java:247)
>>>>         at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:426)
>>>>         at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:456)
>>>>         at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:508)
>>>>         at org.apache.pig.impl.PigContext.instantiateFuncFromAlias(PigContext.java:531)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.EvalFuncSpec(QueryParser.java:5462)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:5291)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:5187)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:5133)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:5042)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:4968)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:4934)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:4861)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:4747)
>>>