

Using data from S3 on local cluster
Hi,

 

I have an urgent problem with the Hive server in Java when reading data from
Amazon S3.

 

I am trying to read data from a table whose content is stored on Amazon S3
and process it with a local cluster.

 

The data is accessed by setting the following value in the server's
configuration when creating the server in Java:

 

fs.defaultFS=s3://my-bucket

 

Commands that need no Hadoop job (like "SELECT * FROM testable") work
that way. But when I add a WHERE clause, so that a Hadoop job is
created on the local cluster, the job fails immediately after initialization.

I suppose this is because the server tries to use the S3 space to run
the job or to store job information.
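
To illustrate what I suspect: as far as I understand, Hadoop qualifies any path that carries no scheme against fs.defaultFS, so a scheme-less working directory would end up on S3 rather than on the local cluster. Below is a minimal sketch of that resolution in plain Java. It uses only java.net.URI and is not Hadoop's actual code; the qualify helper, the path /tmp/hive-scratch, and the address namenode:8020 are made up for illustration.

```java
import java.net.URI;

public class DefaultFsSketch {

    // Hypothetical helper: mimics how a path without a scheme is
    // qualified against the default file system URI.
    static URI qualify(URI defaultFs, String path) {
        URI p = URI.create(path);
        if (p.getScheme() != null) {
            // Already fully qualified (e.g. hdfs://... or file:///...),
            // so the default file system is not consulted.
            return p;
        }
        // No scheme: scheme and authority are taken from the default file system.
        return defaultFs.resolve(p);
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("s3://my-bucket");

        // Scheme-less path lands on S3.
        System.out.println(qualify(defaultFs, "/tmp/hive-scratch"));
        // prints s3://my-bucket/tmp/hive-scratch

        // Fully qualified path stays where it points (address is an assumption).
        System.out.println(qualify(defaultFs, "hdfs://namenode:8020/tmp/hive-scratch"));
        // prints hdfs://namenode:8020/tmp/hive-scratch
    }
}
```

If this is what happens, then only paths that are written with an explicit scheme would stay off S3, which would explain why a plain SELECT works and a query that needs a job does not.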

 

How can I solve this?

 

Thanks

Tim
