HBase user mailing list: Efficient way to read a large number of files in S3 and upload their content to HBase


Re: Efficient way to read a large number of files in S3 and upload their content to HBase
Amandeep Khurana 2012-05-24, 18:55
Marcos,

You could do a distcp from S3 to HDFS and then do a bulk import into HBase.
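For example, something along these lines (the bucket, paths, table and column names are just placeholders, and this assumes your files can be converted to tab-separated lines; if your format does not map to TSV you would write a custom MapReduce job that emits HFileOutputFormat output instead of the ImportTsv step):

    # Copy the files from S3 into HDFS (s3n:// credentials are taken from
    # fs.s3n.awsAccessKeyId / fs.s3n.awsSecretAccessKey in core-site.xml)
    hadoop distcp s3n://your-bucket/input/ hdfs:///user/hadoop/input/

    # Generate HFiles with ImportTsv instead of writing through the HBase API
    hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.columns=HBASE_ROW_KEY,cf:content \
      -Dimporttsv.bulk.output=hdfs:///user/hadoop/hfiles \
      yourtable hdfs:///user/hadoop/input/

    # Hand the generated HFiles to the region servers
    hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
      hdfs:///user/hadoop/hfiles yourtable

The bulk-load path skips the write path (WAL and memstore) entirely, which is usually much faster than millions of individual Puts.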

Are you running HBase on EC2 or on your own hardware?

-Amandeep  
On Thursday, May 24, 2012 at 11:52 AM, Marcos Ortiz wrote:

> Greetings to everyone on the list.
> We are using Amazon S3 to store millions of files in a certain format,  
> and we want to read the content of those files and then upload it to  
> an HBase cluster.
> Has anyone done this?
> Can you recommend an efficient way to do this?
>  
> Best wishes.
>  
> --  
> Marcos Luis Ortíz Valmaseda
> Data Engineer && Sr. System Administrator at UCI
> http://marcosluis2186.posterous.com
> http://www.linkedin.com/in/marcosluis2186
> Twitter: @marcosluis2186
>  
>  
> 10th ANNIVERSARY OF THE CREATION OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
>  
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci
>  
>