Pig, mail # user - Loading LZOs With Some JSON


Re: Loading LZOs With Some JSON
Eli Finkelshteyn 2011-09-12, 17:46
Hmm, I'm loading up hadoop-lzo.*.jar, elephant-bird.*.jar, guava-*.jar,
and piggybank.jar, and then trying to use that UDF, but getting the
following error:

ERROR 2998: Unhandled internal error. org/json/simple/parser/ParseException

java.lang.NoClassDefFoundError: org/json/simple/parser/ParseException
         at java.lang.Class.forName0(Native Method)
         at java.lang.Class.forName(Class.java:247)
         at org.apache.pig.impl.PigContext.resolveClassName(PigContext.java:426)
         at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:456)
         at org.apache.pig.impl.PigContext.instantiateFuncFromSpec(PigContext.java:508)
         at org.apache.pig.impl.PigContext.instantiateFuncFromAlias(PigContext.java:531)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.EvalFuncSpec(QueryParser.java:5462)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:5291)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:5187)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:5133)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:5042)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:4968)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:4934)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:4861)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:4747)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:4704)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:4030)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:3433)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1464)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:1013)
         at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:800)
         etc...

Any ideas? I've verified that Pig recognizes the function itself, and
that the data it's running on is valid JSON. Not sure what else I can check.

Eli
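
The missing class in the trace above, org.json.simple.parser.ParseException,
belongs to the json-simple library, so one likely cause is that the
json-simple jar is not registered alongside the other jars. The sketch below
registers it and then follows the approach suggested in the quoted reply:
load with LzoPigStorage, then apply JsonStringToMap to the third field. All
jar paths, the input path, and the field names (item_a, item_b, json) are
placeholders, and the fully qualified class names are assumptions that should
be checked against the elephant-bird jar actually in use.

```pig
-- Placeholder jar paths; adjust to the local installation.
REGISTER /path/to/hadoop-lzo.jar;
REGISTER /path/to/elephant-bird.jar;
REGISTER /path/to/guava.jar;
REGISTER /path/to/piggybank.jar;
-- json-simple provides org.json.simple.parser.ParseException,
-- the class the NoClassDefFoundError complains about.
REGISTER /path/to/json-simple.jar;

-- Assumed package names; verify them against your elephant-bird version.
DEFINE LzoPigStorage   com.twitter.elephantbird.pig.store.LzoPigStorage('\t');
DEFINE JsonStringToMap com.twitter.elephantbird.pig.piggybank.JsonStringToMap();

-- Load the tab-delimited LZO input as three plain string fields.
raw = LOAD '/path/to/input' USING LzoPigStorage
      AS (item_a:chararray, item_b:chararray, json:chararray);

-- Parse only the third field as JSON, yielding a Pig map.
parsed = FOREACH raw GENERATE item_a, item_b, JsonStringToMap(json) AS props;
```

If the error persists after registering json-simple, it may be worth checking
that the registered jar version matches the one elephant-bird was built
against.
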
On 9/9/11 7:13 PM, Dmitriy Ryaboy wrote:
> They derive from the same classes as far as lzo handling goes, so I suspect
> something's up with your environment or inputs if you get LzoTokenizedLoader
> to work, but LzoJsonStorage does not.
>
> Note that LzoTokenizedLoader is deprecated -- just use LzoPigStorage.
>
> JsonLoader wouldn't work for you because it expects the complete input line
> to be json, not part of it. You want to load with LzoPigStorage, and then
> apply the JsonStringToMap udf to the third field.
>
> -D
>
>
> On Fri, Sep 9, 2011 at 3:49 PM, Eli Finkelshteyn<[EMAIL PROTECTED]>  wrote:
>
>> Hi,
>> I'm currently working on trying to load lzos that contain some JSON
>> elements. This is of the form:
>>
>> item1    item2    {'thing1':'1','thing2':'2'}
>> item3    item4    {'thing3':'1','thing27':'2'}
>> item5    item6    {'thing5':'1','thing19':'2'}
>>
>> I was thinking I could use LzoJsonLoader for this, but it keeps throwing me
>> errors like:
>> ERROR com.hadoop.compression.lzo.LzoCodec - Cannot load native-lzo
>> without native-hadoop
>>
>> This is despite the fact that I can load normal lzos just fine using
>> LzoTokenizedLoader('\\t'). So, now I'm at a bit of a standstill. What should
>> I do to go about loading these files? Does anyone have any ideas?
>>
>> Cheers,
>> Eli
>>