Re: Efficient way to read a large number of files in S3 and upload their content to HBase
Thanks a lot for your answer, Amandeep.

On 05/24/2012 02:55 PM, Amandeep Khurana wrote:
> Marcos,
>
> You could do a distcp from S3 to HDFS and then do a bulk import into HBase.
The number of files is very large, so we want to combine some of them
and then build the HFiles to load into HBase.
Is there an example of a custom FileMerger for this?
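
A minimal sketch of how the combining step might be planned, assuming the
file names and sizes are already known (for example from a bucket listing).
It is not an existing FileMerger: the function name plan_merge_batches and
the target_bytes parameter are hypothetical, and it only decides which files
go into each combined file. Reading from S3 and writing HFiles are left out.

```python
from typing import Iterable, List, Tuple


def plan_merge_batches(
    files: Iterable[Tuple[str, int]], target_bytes: int
) -> List[List[str]]:
    """Greedily group (name, size_in_bytes) pairs into batches.

    Files are taken in the order given. A batch is closed as soon as adding
    the next file would push it past target_bytes. A single file larger than
    target_bytes ends up in a batch of its own.
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")

    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0

    for name, size in files:
        # Close the current batch if this file would overflow it.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size

    # Keep the last, possibly partial, batch.
    if current:
        batches.append(current)
    return batches
```

Each returned batch could then be concatenated into one larger file on HDFS
before the bulk import. A target close to the HDFS block size is a plausible
starting point, but that is an assumption to tune, not a recommendation from
this thread.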
>
> Are you running HBase on EC2 or on your own hardware?
We have set up a small HBase cluster on our own hardware, but we want to
build another cluster on top of Amazon EC2. That
should make the integration between S3 and the HBase cluster easier.

Regards
>
> -Amandeep
>
>
> On Thursday, May 24, 2012 at 11:52 AM, Marcos Ortiz wrote:
>
>> Greetings to everyone on the list.
>> We are using Amazon S3 to store millions of files in a certain format,
>> and we want to read the content of these files and then upload it to
>> an HBase cluster.
>> Has anyone done this?
>> Can you recommend an efficient way to do it?
>>
>> Best wishes.
>>
>> --
>> Marcos Luis Ortíz Valmaseda
>> Data Engineer && Sr. System Administrator at UCI
>> http://marcosluis2186.posterous.com
>> http://www.linkedin.com/in/marcosluis2186
>> Twitter: @marcosluis2186
>>
>>
>> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
>> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
>>
>> http://www.uci.cu
>> http://www.facebook.com/universidad.uci
>> http://www.flickr.com/photos/universidad_uci
>>
>>

--
Marcos Luis Ortíz Valmaseda
  Data Engineer && Sr. System Administrator at UCI
  http://marcosluis2186.posterous.com
  http://www.linkedin.com/in/marcosluis2186
  Twitter: @marcosluis2186