Pig >> mail # user >> pig script - failed reading input from s3


Panshul Whisper 2013-04-07, 17:11
Re: pig script - failed reading input from s3
Try
fs.s3n.aws…

and also load using the s3n scheme:
data = load 's3n://...'

The "n" stands for native. I believe Hadoop also supports a block-based file system on S3 (s3://), which allows bigger files to be stored. I don't know how (if at all) the two types interact.
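
For illustration, here is a minimal sketch of the original script switched over to s3n. It assumes the properties meant above are Hadoop's standard s3n credential keys, fs.s3n.awsAccessKeyId and fs.s3n.awsSecretAccessKey; the key values are placeholders, and the bucket paths are copied from the original script. I have not run this against that bucket, so treat it as a starting point rather than a confirmed fix.

```pig
-- Assumption: s3n reads its credentials from these two Hadoop properties.
-- Replace the placeholder values with real keys.
set fs.s3n.awsAccessKeyId xxxxxxxxxxxxxxxxxx
set fs.s3n.awsSecretAccessKey xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

-- load the sample input file through the native S3 file system
data = load 's3n://steamdata/nysedata/NYSE_daily.txt' as
    (exchange:chararray, symbol:chararray, date:chararray, open:float,
     high:float, low:float, close:float, volume:int, adj_close:float);

-- group by symbol, count the rows per group, order by that count
symbolgrp = group data by symbol;
symcount = foreach symbolgrp generate group, COUNT(data);
symcountordered = order symcount by $1;

-- write the result back through s3n as well
store symcountordered into 's3n://steamdata/nyseoutput/daily';
```

Note that both the load and the store path use s3n://, so the same credentials apply to reading and writing.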

David

On Apr 7, 2013, at 1:11 PM, Panshul Whisper <[EMAIL PROTECTED]> wrote:

> Hello
>
> I am trying to run a pig script which is supposed to read input from s3
> and write back to s3. The cluster
> scenario is as follows:
> * Cluster is installed on EC2 using Cloudera Manager 4.5 Automatic
> Installation
> * Installed version: CDH4
> * Script location: on one of the nodes of the cluster
> * Running as: $ pig countGroups_daily.pig
>
> *The Pig Script*:
> set fs.s3.awsAccessKeyId xxxxxxxxxxxxxxxxxx
> set fs.s3.awsSecretAccessKey xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
> --load the sample input file
> data = load 's3://steamdata/nysedata/NYSE_daily.txt' as
> (exchange:chararray, symbol:chararray, date:chararray, open:float,
> high:float, low:float, close:float, volume:int, adj_close:float);
> --group data by symbols
> symbolgrp = group data by symbol;
> --count data in every group
> symcount = foreach symbolgrp generate group,COUNT(data);
> --order the counted list by count
> symcountordered = order symcount by $1;
> store symcountordered into 's3://steamdata/nyseoutput/daily';
>
> *Error:*
>
> Message: org.apache.pig.backend.executionengine.ExecException: ERROR 2118:
> Input path does not exist: s3://steamdata/nysedata/NYSE_daily.txt
>
> Input(s):
> Failed to read data from "s3://steamdata/nysedata/NYSE_daily.txt"
>
> Please help me: what am I doing wrong? I can assure you that the input
> path/file exists on s3 and the AWS key and secret key entered are correct.
>
> Thanking You,
>
>
> --
> Regards,
> Ouch Whisper
> 010101010101

Panshul Whisper 2013-04-08, 15:30
Vitalii Tymchyshyn 2013-04-09, 07:09
Panshul Whisper 2013-04-10, 10:00