Pig >> mail # user >> JsonStorage() fails to write .pig_schema to S3 after correctly writing alias to json files.


JsonStorage() fails to write .pig_schema to S3 after correctly writing alias to json files.
Hello,

I've been having trouble with JsonStorage(). First, since my Python UDF had
an outputSchema that returned floats, I was getting an error in JsonStorage
trying to cast Double to Float. I resolved this by changing my UDF to
return doubles.
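
For readers hitting the same cast error, here is a minimal sketch of the general failure mode in plain Java. It is not JsonStorage's actual code. It assumes the UDF's float values arrive on the Java side as java.lang.Double (which is how Jython represents Python floats), while the declared schema leads the writer to expect java.lang.Float. The class and method names are hypothetical.

```java
public class DoubleToFloatSketch {

    // Direct cast: throws ClassCastException when the runtime value is a Double,
    // even though the declared field type says "float".
    static Float unsafeCast(Object value) {
        return (Float) value;
    }

    // Tolerant conversion: goes through Number, so both Float and Double work.
    static Float viaNumber(Object value) {
        return ((Number) value).floatValue();
    }

    public static void main(String[] args) {
        Object fromUdf = Double.valueOf(1.5); // assumed runtime type of the UDF's value

        try {
            unsafeCast(fromUdf);
            System.out.println("cast succeeded");
        } catch (ClassCastException e) {
            System.out.println("cast failed");
        }

        System.out.println(viaNumber(fromUdf));
    }
}
```

Declaring the field as double in the outputSchema, as described above, sidesteps the mismatch because the declared type then matches the runtime type.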

Versions: Pig 0.11.1, Hadoop 1.0.3.

Next, I am able to successfully write JSON files out to S3 (I was watching
as my Pig job was running and grabbed a sample), but then, at what appears to
be the final step of writing .pig_schema, this error is thrown:

grunt> STORE firsts INTO 's3n://n2ygk/firsthops.json' using JsonStorage();

... chugs along for a while successfully writing
s3://n2ygk/firsthops.json/part-r-* into the bucket ... and then:

java.lang.IllegalArgumentException: This file system object (hdfs://10.253.44.244:9000) does not support access to the request path 's3n://n2ygk/firsthops.json/.pig_schema' You possibly called FileSystem.get(conf) when you should have called FileSystem.get(uri, conf) to obtain a file system supporting your path.
    at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:384)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:129)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:513)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:770)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.isContainer(HDataStorage.java:200)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:128)
    at org.apache.pig.backend.hadoop.datastorage.HDataStorage.asElement(HDataStorage.java:144)
    at org.apache.pig.builtin.JsonMetadata.storeSchema(JsonMetadata.java:294)
    at org.apache.pig.builtin.JsonStorage.storeSchema(JsonStorage.java:274)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.storeCleanup(PigOutputCommitter.java:141)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter.commitJob(PigOutputCommitter.java:204)
    at org.apache.hadoop.mapred.Task.runJobCleanupTask(Task.java:1060)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:362)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
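
The exception message contrasts FileSystem.get(conf) with FileSystem.get(uri, conf). The sketch below is a simplified model of that difference using only java.net.URI. It is not Hadoop's implementation, and BoundFs, getDefault, and getFor are hypothetical names. It only illustrates why a file system object bound to the cluster's default hdfs:// URI would reject an s3n:// path, while one selected from the path's own URI would accept it.

```java
import java.net.URI;

public class FsPathCheckSketch {

    // Hypothetical stand-in for a file system object bound to one scheme and authority.
    static final class BoundFs {
        final URI base;

        BoundFs(URI base) {
            this.base = base;
        }

        // Simplified path check: a fully qualified path must match this
        // file system's scheme and, when the path names one, its authority.
        boolean supports(URI path) {
            if (path.getScheme() == null) {
                return true; // unqualified paths resolve against this file system
            }
            if (!path.getScheme().equalsIgnoreCase(base.getScheme())) {
                return false;
            }
            String wanted = path.getAuthority();
            return wanted == null || wanted.equalsIgnoreCase(base.getAuthority());
        }
    }

    // Models FileSystem.get(conf): always the configured default file system.
    static BoundFs getDefault(URI defaultFs) {
        return new BoundFs(defaultFs);
    }

    // Models FileSystem.get(uri, conf): chosen from the path's scheme and authority.
    static BoundFs getFor(URI path, URI defaultFs) {
        if (path.getScheme() == null) {
            return new BoundFs(defaultFs);
        }
        String authority = path.getAuthority() == null ? "" : path.getAuthority();
        return new BoundFs(URI.create(path.getScheme() + "://" + authority + "/"));
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://10.253.44.244:9000");
        URI schemaPath = URI.create("s3n://n2ygk/firsthops.json/.pig_schema");

        System.out.println(getDefault(defaultFs).supports(schemaPath));         // false
        System.out.println(getFor(schemaPath, defaultFs).supports(schemaPath)); // true
    }
}
```

In this model the data files can be written fine while the schema write fails, as long as the two steps obtain their file system objects in different ways. Whether that is what happens inside Pig 0.11.1 is an assumption drawn from the stack trace, not something verified here.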

Any ideas?

Thanks.
/a
Russell Jurney 2013-06-08, 23:41
Shahab Yunus 2013-06-08, 23:14
Alan Crosswell 2013-06-09, 00:46