Pig >> mail # user >> elephantbird JsonLoader doesn't like gz?


elephantbird JsonLoader doesn't like gz?
Hi,

Is anyone using Twitter's elephantbird library? I was using its JsonLoader and
got this error:

WARN  com.twitter.elephantbird.pig.load.JsonLoader - Could not json-decode string
Unexpected character () at position 0.
at org.json.simple.parser.Yylex.yylex(Unknown Source)
at org.json.simple.parser.JSONParser.nextToken(Unknown Source)
at org.json.simple.parser.JSONParser.parse(Unknown Source)
at org.json.simple.parser.JSONParser.parse(Unknown Source)

But if I manually gunzip the file to a plain-text JSON file, JsonLoader
works fine.

To recap, this fails:

raw_json = LOAD 'cc.json.gz' USING
com.twitter.elephantbird.pig.load.JsonLoader();

This works:

$ gunzip cc.json.gz
raw_json = LOAD 'cc.json' USING
com.twitter.elephantbird.pig.load.JsonLoader();
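
The "Unexpected character ... at position 0" message would be consistent with the JSON parser being handed the raw compressed bytes instead of the decompressed text, although nothing here confirms that this is what happens inside JsonLoader. Below is a minimal Python sketch of that idea, outside Pig entirely. It assumes the file holds one JSON object per line, and the helper names `is_gzip` and `decode_json_lines` are made up for illustration; they are not part of elephantbird.

```python
import gzip
import json

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(data: bytes) -> bool:
    """Return True if the buffer starts with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def decode_json_lines(data: bytes) -> list:
    """Decode one JSON value per line, gunzipping first when needed.

    Hypothetical helper: assumes UTF-8 text and line-delimited JSON.
    """
    if is_gzip(data):
        data = gzip.decompress(data)
    text = data.decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    plain = b'{"id": 1}\n{"id": 2}\n'
    packed = gzip.compress(plain)

    # Feeding the compressed bytes straight to a JSON parser fails on the
    # very first character, much like the warning in the log.
    try:
        json.loads(packed.decode("latin-1"))
    except ValueError as err:
        print("raw gzip bytes are not JSON:", err)

    # Decompressing first gives the same records as the plain file.
    print(decode_json_lines(packed) == decode_json_lines(plain))
```

Running the same magic-byte check on the first two bytes of cc.json.gz is a quick way to confirm the file really is gzip and not something else with a .gz name.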

Any suggestions for this? Or is there any other JSON loader library out
there? I can write my own, but I would rather use one if it already exists.

Thanks,

Dexin