Pig, mail # user - elephantbird JsonLoader doesn't like gz?


elephantbird JsonLoader doesn't like gz?
Dexin Wang 2011-05-18, 18:12
Hi,

Is anyone using Twitter's elephantbird library? I was using its JsonLoader and
got this error:

WARN  com.twitter.elephantbird.pig.load.JsonLoader - Could not json-decode string
Unexpected character () at position 0.
    at org.json.simple.parser.Yylex.yylex(Unknown Source)
    at org.json.simple.parser.JSONParser.nextToken(Unknown Source)
    at org.json.simple.parser.JSONParser.parse(Unknown Source)
    at org.json.simple.parser.JSONParser.parse(Unknown Source)

But if I manually gunzip the file to a plain-text JSON file, JsonLoader
works fine.

To be specific, this fails:

raw_json = LOAD 'cc.json.gz' USING com.twitter.elephantbird.pig.load.JsonLoader();

This works:

$ gunzip cc.json.gz
raw_json = LOAD 'cc.json' USING com.twitter.elephantbird.pig.load.JsonLoader();
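
The "Unexpected character () at position 0" message would be consistent with the JSON parser being handed the compressed bytes instead of decompressed text, although the log alone does not prove that. One way to check a file outside of Pig is to look for the gzip magic number (0x1f 0x8b) in its first two bytes. The following is a minimal Python sketch, independent of elephantbird; is_gzipped and read_json_lines are hypothetical helper names, and it assumes one JSON object per line.

```python
import gzip
import json

# The first two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def read_json_lines(path):
    """Parse one JSON object per line, decompressing first if needed.

    Assumption: the input is line-delimited JSON encoded as UTF-8.
    """
    opener = gzip.open if is_gzipped(path) else open
    records = []
    with opener(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

Calling read_json_lines('cc.json.gz') and read_json_lines('cc.json') should return the same records if the only difference between the two files is compression.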

Any suggestions? Or is there another JSON loader library out there? I could
write my own, but would rather use an existing one.

Thanks,

Dexin