HBase, mail # user - Storing extremely large size file

Re: Storing extremely large size file
Michel Segel 2012-04-18, 11:57
Look, I don't want to be *that* guy, but just my $0.02 ...

I don't disagree that this topic comes up all the time.
But you have a couple of issues. What do you consider to be large?
1 KB? 10 KB? 100 KB? >1 MB?
(and then there's that lie that size doesn't matter... But let's not go there... ;-)

Then there is the issue of region size, number of regions per RS...
(this question alone yields different answers from different people, meaning there isn't a single right answer or even a meaningful one.)

Then how about the heap size for the RS?
How about how much memory you should have for your DNs...

You then run into someone saying that they read that they could build a DN with X hardware, and yet this chapter now says you need to have Y, and what's up with that...

Then you have the issue of alternatives like storing the blob in a sequence file while you store the index in HBase...
Just saying... :-)
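
To make that last alternative concrete, here is a minimal sketch of the "blob in a container file, index elsewhere" idea. It is only an illustration under stated assumptions: a local append-only file stands in for a sequence file on HDFS, a plain dict stands in for the HBase index table, and the BlobStore class and its method names are hypothetical, not part of any HBase or Hadoop API.

```python
import os
import tempfile


class BlobStore:
    """Append blobs to one container file; index them by (offset, length).

    Assumption: the container file stands in for a sequence file on HDFS,
    and the dict stands in for an HBase table holding the index.
    """

    def __init__(self, path):
        self.path = path
        self.index = {}  # row key -> (offset, length)

    def put(self, key, blob):
        # Append the blob and remember where it landed in the file.
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(blob)
        self.index[key] = (offset, len(blob))

    def get(self, key):
        # Look up the location in the index, then read only those bytes.
        offset, length = self.index[key]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)


if __name__ == "__main__":
    store = BlobStore(os.path.join(tempfile.mkdtemp(), "blobs.bin"))
    store.put("img-1", b"\x00" * 2048)
    store.put("img-2", b"hello blob")
    print(store.index["img-2"], store.get("img-2"))
```

The point of the split is that the index rows stay tiny no matter how large the blobs get, so the "what is large?" question moves out of the table and into the file layer.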

Sent from a remote device. Please excuse any typos...

Mike Segel

On Apr 18, 2012, at 12:02 AM, lars hofhansl <[EMAIL PROTECTED]> wrote:

> I disagree. This comes up frequently and some basic guidelines should be documented in the Reference Guide.
> If it is indeed not difficult, then the section in the book will be short.
> ----- Original Message -----
> From: Michael Segel <[EMAIL PROTECTED]>
> Sent: Tuesday, April 17, 2012 3:43 PM
> Subject: Re: Storing extremely large size file
> -1. It's a boring topic.
> And it's one of those things that you either get it right or you end up hiring a voodoo witch doctor to curse the author of the chapter...
> I agree with Jack: it's not difficult, it just takes some planning and forethought.
> Also reading lots of blogs... And some practice...
> Sent from my iPhone
> On Apr 17, 2012, at 1:42 PM, "Dave Revell" <[EMAIL PROTECTED]> wrote:
>> +1 Jack :)
>> On Tue, Apr 17, 2012 at 11:38 AM, Stack <[EMAIL PROTECTED]> wrote:
>>> On Tue, Apr 17, 2012 at 11:18 AM, Dave Revell <[EMAIL PROTECTED]>
>>> wrote:
>>>> I think this is a popular topic that might deserve a section in The Book.
>>>> By "this topic" I mean storing big binary chunks.
>>> Get Jack Levin to write it (smile).
>>> And make sure the values are compressed that you send over from the
>>> client....
>>> St.Ack
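
As a rough illustration of the advice to compress values before sending them from the client, here is a minimal sketch. It assumes zlib as the codec and uses hypothetical helper names (compress_value, decompress_value); the thread does not name a codec, and the actual write to HBase is left out.

```python
import zlib


def compress_value(value: bytes, level: int = 6) -> bytes:
    """Compress a value on the client before it goes over the wire."""
    return zlib.compress(value, level)


def decompress_value(stored: bytes) -> bytes:
    """Reverse compress_value after reading the stored bytes back."""
    return zlib.decompress(stored)


if __name__ == "__main__":
    payload = b"some highly repetitive payload " * 1000
    stored = compress_value(payload)
    print(len(payload), "->", len(stored), "bytes")
    assert decompress_value(stored) == payload
```

Payloads that are already compressed, such as JPEG images, will shrink very little, so it is worth measuring on real data before adopting this.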