HBase, mail # user - EC2 Elastic MapReduce HBase install recommendations


Pal Konyves 2013-05-08, 02:01
Marcos Luis Ortiz Valmase... 2013-05-08, 02:31
ramkrishna vasudevan 2013-05-08, 02:41
Marcos Luis Ortiz Valmase... 2013-05-08, 02:42
Andrew Purtell 2013-05-09, 04:04
Amandeep Khurana 2013-05-09, 04:12
Re: EC2 Elastic MapReduce HBase install recommendations
Michel Segel 2013-05-09, 04:47
With respect to EMR, you can run HBase fairly easily.
You can't run MapR with HBase on EMR, so stick with Amazon's release.

You can run it, but you will want to know your tuning parameters up front when you instantiate it.

Sent from a remote device. Please excuse any typos...

Mike Segel

On May 8, 2013, at 9:04 PM, Andrew Purtell <[EMAIL PROTECTED]> wrote:

> M7 is not Apache HBase, or any HBase. It is a proprietary NoSQL datastore
> with (I gather) an Apache HBase compatible Java API.
>
> As for running HBase on EC2, we recently discussed some particulars, see
> the latter part of this thread: http://search-hadoop.com/m/rI1HpK90gu where
> I hijack it. I wouldn't recommend launching HBase as part of an EMR flow
> unless you want to use it only for temporary random access storage, and in
> which case use m2.2xlarge/m2.4xlarge instance types. Otherwise, set up a
> dedicated HBase backed storage service on high I/O instance types. The
> fundamental issue is IO performance on the EC2 platform is fair to poor.
>
> I have also noticed a large difference in baseline block device latency if
> using an old Amazon Linux AMI (< 2013) or the latest AMIs from this year.
> Use the new ones, they cut the latency long tail in half. There were some
> significant kernel level improvements I gather.
>
>
> On Wed, May 8, 2013 at 10:42 AM, Marcos Luis Ortiz Valmaseda <
> [EMAIL PROTECTED]> wrote:
>
>> I think that when you talk about RMap, you are referring to
>> MapR's distribution.
>> I think that MapR's team released a very good version of its Hadoop
>> distribution focused on HBase called M7. You can see its overview here:
>> http://www.mapr.com/products/mapr-editions/m7-edition
>>
>> But this release was in beta testing, and I see that it's not included
>> in the Amazon Marketplace yet:
>>
>> https://aws.amazon.com/marketplace/seller-profile?id=802b0a25-877e-4b57-9007-a3fd284815a5
>>
>>
>>
>>
>> 2013/5/7 Pal Konyves <[EMAIL PROTECTED]>
>>
>>> Hi,
>>>
>>> Has anyone got some recommendations about running HBase on EC2? I am
>>> testing it, and so far I am very disappointed with it. I did not change
>>> anything about the default 'Amazon distribution' installation. It has one
>>> master node and two slave nodes, and write performance is around 2500
>>> small rows per sec at most, but I expected it to be way better. Oh, and
>>> this is with batch put operations with autocommit turned off, where each
>>> batch contains about 500-1000 rows... When I do it with autocommit, it
>>> does not even reach 1000 rows per sec.
>>>
>>> All nodes were m1.large instances.
>>>
>>> Any experiences, suggestions? Is it worth trying the RMap distribution
>>> instead of the Amazon one?
>>>
>>> Thanks,
>>> Pal
>>
>>
>>
>> --
>> Marcos Ortiz Valmaseda
>> Product Manager at PDVSA
>> http://about.me/marcosortiz
>
>
>
> --
> Best regards,
>
>   - Andy
>
> Problems worthy of attack prove their worth by hitting back. - Piet Hein
> (via Tom White)
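
As a rough reading of the figures in the original question (about 2500 rows per sec with batches of 500-1000 rows), the sketch below works out how many batches per second that implies and how long each batch round trip would take. It assumes a single client thread sending batches back to back, which the thread does not state; the function name `batch_timing` is made up for illustration.

```python
def batch_timing(rows_per_sec, batch_size):
    """Return (batches_per_sec, ms_per_batch) implied by a throughput figure.

    Assumption (not stated in the thread): one client thread issues batches
    sequentially, so the time per batch is simply 1 / batches_per_sec.
    """
    batches_per_sec = rows_per_sec / batch_size
    ms_per_batch = 1000.0 / batches_per_sec
    return batches_per_sec, ms_per_batch


# Figures quoted in the original question: ~2500 rows/s, 500-1000 rows per batch.
for batch_size in (500, 1000):
    bps, ms = batch_timing(2500, batch_size)
    print(f"batch of {batch_size}: {bps:.1f} batches/s, ~{ms:.0f} ms per batch")
# batch of 500: 5.0 batches/s, ~200 ms per batch
# batch of 1000: 2.5 batches/s, ~400 ms per batch
```

Under that assumption each batch would be taking a few hundred milliseconds end to end, which would be in line with the IO latency concerns raised above; with several concurrent writers the per-batch time would be correspondingly longer.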
Pal Konyves 2013-05-09, 09:39
Michel Segel 2013-05-09, 12:32
Pal Konyves 2013-05-12, 02:14
Ted Yu 2013-05-12, 02:25
Asaf Mesika 2013-05-12, 05:13