Re: Copying files to Amazon S3 using Pig is slow
Also, use multiple streams to S3 to get better throughput.
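For example (a minimal sketch, assuming the files live under /tmp/files on HDFS and the cluster already has S3 credentials configured; the bucket name is a placeholder), plain distcp with a higher map count runs several concurrent copy streams into S3 rather than one file at a time:

  # -m sets the number of map tasks, i.e. concurrent copy streams
  hadoop distcp -m 20 hdfs:///tmp/files s3n://my-bucket/files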

On Fri, Jun 8, 2012 at 3:24 PM, Aniket Mokashi <[EMAIL PROTECTED]> wrote:

>
> http://docs.amazonwebservices.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_s3distcp.html
>
> On Fri, Jun 8, 2012 at 4:40 AM, James Newhaven <[EMAIL PROTECTED]> wrote:
>
> > I want to copy 26,000 HDFS files generated by a pig script to Amazon S3.
> >
> > I am using the copyToLocal command, but I noticed the copy throughput is
> > only one file per second - so it is going to take about 7 hours to copy all
> > the files.
> >
> > The command I am using is: copyToLocal /tmp/files/ s3://my-bucket/
> >
> > Does anyone have any ideas how I could speed this up?
> >
> > Thanks,
> > James
> >
>
>
>
> --
> "...:::Aniket:::... Quetzalco@tl"
>
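For reference, a minimal S3DistCp invocation along the lines of the EMR guide linked above might look as follows (the jar location is an assumption and varies by EMR version; bucket and paths are placeholders):

  hadoop jar /home/hadoop/lib/emr-s3distcp-1.0.jar \
    --src hdfs:///tmp/files \
    --dest s3://my-bucket/files

Like distcp, S3DistCp runs the copy as a MapReduce job, so it spreads the 26,000 files across many tasks instead of pushing them through a single stream.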