Pig >> mail # user >> pig script - failed reading input from s3


pig script - failed reading input from s3
Hello

I am trying to run a Pig script that is supposed to read its input from S3
and write its output back to S3. The cluster
setup is as follows:
* The cluster is installed on EC2 using Cloudera Manager 4.5 Automatic
Installation
* Installed version: CDH4
* Script location: on one of the cluster nodes
* Run as: $ pig countGroups_daily.pig

*The Pig Script*:
set fs.s3.awsAccessKeyId xxxxxxxxxxxxxxxxxx
set fs.s3.awsSecretAccessKey xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
--load the sample input file
data = load 's3://steamdata/nysedata/NYSE_daily.txt' as
(exchange:chararray, symbol:chararray, date:chararray, open:float,
high:float, low:float, close:float, volume:int, adj_close:float);
--group data by symbols
symbolgrp = group data by symbol;
--count data in every group
symcount = foreach symbolgrp generate group,COUNT(data);
--order the counted list by count
symcountordered = order symcount by $1;
store symcountordered into 's3://steamdata/nyseoutput/daily';

*Error:*

Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118:
Input path does not exist: s3://steamdata/nysedata/NYSE_daily.txt

Input(s):
Failed to read data from "s3://steamdata/nysedata/NYSE_daily.txt"

Please help me understand what I am doing wrong. I can assure you that the
input path/file exists on S3 and that the AWS access key and secret key
entered are correct.
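
One variation that may be worth trying (a sketch under assumptions, not a confirmed fix): in Hadoop releases of this generation the s3:// scheme refers to the block-based S3 filesystem, while files uploaded to a bucket as ordinary objects are read through the native s3n:// scheme. Assuming NYSE_daily.txt was uploaded as a plain object, the same script with the scheme and credential property names switched would look like this. The bucket paths and placeholder credentials are unchanged from the script above.

```pig
-- Assumption: the input was uploaded as an ordinary S3 object, so the
-- native scheme (s3n://) is used instead of the block-based s3:// scheme.
-- The credential properties must match the scheme used in the paths.
set fs.s3n.awsAccessKeyId xxxxxxxxxxxxxxxxxx
set fs.s3n.awsSecretAccessKey xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
-- load the sample input file
data = load 's3n://steamdata/nysedata/NYSE_daily.txt' as
(exchange:chararray, symbol:chararray, date:chararray, open:float,
high:float, low:float, close:float, volume:int, adj_close:float);
-- group data by symbol
symbolgrp = group data by symbol;
-- count the records in every group
symcount = foreach symbolgrp generate group, COUNT(data);
-- order the counted list by count
symcountordered = order symcount by $1;
store symcountordered into 's3n://steamdata/nyseoutput/daily';
```

If this variant reads the file, the original error would point to the scheme rather than to the credentials or the path itself.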

Thanking You,
--
Regards,
Ouch Whisper
010101010101
David LaBarbera 2013-04-08, 13:27
Panshul Whisper 2013-04-08, 15:30
Vitalii Tymchyshyn 2013-04-09, 07:09
Panshul Whisper 2013-04-10, 10:00