
Pig >> mail # user >> Copying files to Amazon S3 using Pig is slow


Re: Copying files to Amazon S3 using Pig is slow
Also use multiple streams of s3 to get better throughput

On Fri, Jun 8, 2012 at 3:24 PM, Aniket Mokashi <[EMAIL PROTECTED]> wrote:

>
> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_s3distcp.html
>
> On Fri, Jun 8, 2012 at 4:40 AM, James Newhaven <[EMAIL PROTECTED]> wrote:
>
> > I want to copy 26,000 HDFS files generated by a pig script to Amazon S3.
> >
> > I am using the copyToLocal command, but I noticed the copy throughput is
> > only one file per second - so it is going to take about 7 hours to copy
> > all the files.
> >
> > The command I am using is: copyToLocal /tmp/files/ s3://my-bucket/
> >
> > Does anyone have any ideas how I could speed this up?
> >
> > Thanks,
> > James
> >
>
>
>
> --
> "...:::Aniket:::... Quetzalco@tl"
>