Pig >> mail # user >> Loading LZOs With Some JSON


Re: Loading LZOs With Some JSON
Well, it's not throwing me errors anymore. Now it's just discarding the
field. When I run it on two records where I've verified a field exists
in the json, I get:

Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 2 time(s).

More specifically, my json is of the following form:

{"foo":0,"bar":"hi"}

On that, I'm running:

initial = LOAD 'some_file.lzo'
    USING com.twitter.elephantbird.pig.store.LzoPigStorage('\\t')
    AS (col1, col2, col3, json_data);
extracted = FOREACH initial GENERATE (chararray) json_data#'type' AS type;
dump extracted;

Which gives me the above warning along with:

()
()

I also tried it without the cast to chararray, but got the same
results. Should I be casting json_data to some other data type when I
load it initially? When I describe initial, it shows up as a bytearray
by default. Would that be a problem?
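
Two things may be worth ruling out here, sketched in Python rather than Pig purely so it can be run standalone. First, the sample record above only has the keys foo and bar, so a lookup of 'type' would come back empty even if the types were right. Second, # is a map lookup in Pig, and json_data holds JSON text, so it presumably has to be parsed into a map before a key lookup can find anything. The helper name lookup is hypothetical and only for illustration; this mimics the idea, not Pig's actual behavior.

```python
import json

# The sample record quoted above, as raw text (what json_data would hold).
raw = '{"foo":0,"bar":"hi"}'


def lookup(raw_json, key):
    """Parse a JSON object string and return the value for key, or None.

    Hypothetical helper: it only illustrates "parse first, then look up".
    """
    record = json.loads(raw_json)  # turn the text into a dict (a "map")
    return record.get(key)         # missing keys yield None instead of raising


print(lookup(raw, "type"))  # None: the sample record has no "type" key
print(lookup(raw, "bar"))   # hi
```

If the real records do carry a type key, then the missing key is not the issue and the parse step is the more likely place to look.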

Thanks for all the help so far!

Eli

On 9/12/11 9:26 PM, Dmitriy Ryaboy wrote:
> Ah yeah that's my favorite thing about Pig maps (prior to pig 0.9,
> theoretically).
> The values are bytearrays. You are probably trying to treat them as strings.
>   You have to do stuff like this:
>
> x = foreach myrelation generate
>    (chararray) mymap#'foo' as foo,
>    (chararray) mymap#'bar' as bar;
>
>
> On Mon, Sep 12, 2011 at 11:54 AM, Eli Finkelshteyn<[EMAIL PROTECTED]>  wrote:
>
>> Hmmm, now it gets past my mention of the function, but when I run a dump on
>> generated information, I get:
>>
>> 2011-09-12 14:48:12,814 [main] ERROR org.apache.pig.tools.grunt.Grunt -
>> ERROR 2997: Unable to recreate exception from backed error:
>> java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot
>> be cast to java.lang.String
>>
>> Thanks for all the help so far!
>>
>> Eli
>>
>>
>> On 9/12/11 2:42 PM, Dmitriy Ryaboy wrote:
>>
>>> You also want json-simple-1.1.jar
>>>
>>>
>>> On Mon, Sep 12, 2011 at 10:46 AM, Eli Finkelshteyn<[EMAIL PROTECTED]>  wrote:
>>>
>>>> Hmm, I'm loading up hadoop-lzo.*.jar, elephant-bird.*.jar, guava-*.jar, and
>>>> piggybank.jar, and then trying to use that UDF, but getting the following
>>>> error:
>>>>
>>>> ERROR 2998: Unhandled internal error. org/json/simple/parser/ParseException
>>>>
>>>> java.lang.NoClassDefFoundError: org/json/simple/parser/ParseException
>>>>         at java.lang.Class.forName0(Native Method)
>>>>         at java.lang.Class.forName(Class.java:247)
>>>>         at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:426)
>>>>         at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:456)
>>>>         at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:508)
>>>>         at org.apache.pig.impl.PigContext.instantiateFuncFromAlias(PigContext.java:531)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.EvalFuncSpec(QueryParser.java:5462)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:5291)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:5187)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:5133)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:5042)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:4968)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:4934)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:4861)
>>>>         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:4747)
>>>