HBase >> mail # user >> hash function per table


Re: hash function per table
What's the performance penalty when scanning with a row prefix filter instead
of with a start/end key?
Can it still work (in reasonable processing time) when the table contains
billions of records?
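
When the prefix is known up front, it can usually be turned into an explicit
start/end key pair, so the comparison is often between the same rows reached
two different ways. The sketch below assumes row keys sort as unsigned bytes
(which is how HBase orders them). The helper names prefix_to_range and
in_range are made up for illustration and are not part of any HBase API.

```python
from typing import Optional, Tuple


def prefix_to_range(prefix: bytes) -> Tuple[bytes, Optional[bytes]]:
    """Return (start_key, stop_key) covering exactly the keys that begin with prefix.

    stop_key is exclusive; None means "no upper bound, scan to the end".
    """
    # Trailing 0xFF bytes cannot be incremented, so drop them first.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        # Empty prefix or all 0xFF bytes: there is no finite upper bound.
        return prefix, None
    # Incrementing the last remaining byte gives the smallest key that sorts
    # after every key starting with the prefix.
    stop = trimmed[:-1] + bytes([trimmed[-1] + 1])
    return prefix, stop


def in_range(key: bytes, start: bytes, stop: Optional[bytes]) -> bool:
    """True if start <= key < stop under unsigned lexicographic byte order."""
    return key >= start and (stop is None or key < stop)


start, stop = prefix_to_range(b"20110320_")
print(start, stop)
print(in_range(b"20110320_abc", start, stop))  # key carries the prefix
print(in_range(b"20110321_abc", start, stop))  # next day, outside the range
```

A scan bounded by such a pair only has to visit the key range that can match,
whereas a filter applied without bounds is evaluated against rows outside it
as well.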
On Sun, Mar 20, 2011 at 10:03 PM, Pete Haidinyak <[EMAIL PROTECTED]> wrote:

> I went through this discussion a month or so ago and came away with the
> opinion that you can either have an efficient load with random keys but then
> an inefficient 'scan' that cannot use start and end rows, or an
> inefficient import with sequential keys and then scan using start and end
> rows.
>
> -Pete
>
>
>
> On Sun, 20 Mar 2011 12:52:24 -0700, Oleg Ruchovets <[EMAIL PROTECTED]>
> wrote:
>
>> Actually, the discussion started from this post:
>>
>>
>>
>> http://search-hadoop.com/m/XX3nW68JsY1/hbase+insertion+optimisation&subj=hbase+insertion+optimisation+
>>
>> Simply inserting the data with row keys of the form <date>_<somedata>, I
>> noticed that only one node works (the one hosting the region the data is
>> being written to). With 10-15 nodes, I think it is inefficient to write
>> data to only one region. I want the data to be inserted into as many
>> nodes as possible simultaneously. Correct me, guys, but in this case the
>> write job will take less time, am I right?
>>
>> Oleg.
>>
>> On Sun, Mar 20, 2011 at 8:57 PM, Chris Tarnas <[EMAIL PROTECTED]> wrote:
>>
>>> There is none - HBase uses a total order partitioner. The straight key
>>> value itself determines which region a row is put into. This allows for
>>> very rapid scans of sequential data, among other things, but does mean it
>>> is easier to hotspot regions. Key design is very important.
>>>
>>> -chris
>>>
>>> On Mar 20, 2011, at 11:41 AM, Lior Schachter wrote:
>>>
>>> > the hash function that distributes the rows between the regions.
>>> >
>>> > On Sun, Mar 20, 2011 at 8:36 PM, Stack <[EMAIL PROTECTED]> wrote:
>>> >
>>> >> Hash?  Which hash are you referring to sir?
>>> >> St.Ack
>>> >>
>>> >> On Sun, Mar 20, 2011 at 10:06 AM, Lior Schachter <[EMAIL PROTECTED]> wrote:
>>> >>> Hi,
>>> >>> What is the API or configuration for changing the default hash function
>>> >>> for a specific HTable?
>>> >>>
>>> >>> thanks,
>>> >>> Lior
>>> >>>
>>> >>
>>>
>>>
>
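
The trade-off discussed in this thread (sequential <date>_<somedata> keys all
land in one region, while randomized keys spread writes but give up a single
start/end key scan) can be sketched in a few lines. This is a toy model, not
HBase code: the region boundaries, the bucket count, the SHA-256-based salt,
and the helper names region_for, salted_key and scan_prefixes are all
assumptions made for illustration.

```python
import bisect
import hashlib
from collections import Counter
from typing import List

# Hypothetical region boundaries: each entry is the first key of a region.
REGION_START_KEYS = [b"", b"1_", b"2_", b"3_"]
NUM_BUCKETS = 4  # assumed number of salt buckets, one per region here


def region_for(row_key: bytes) -> int:
    """Total-order lookup: pick the region with the greatest start key <= row_key."""
    return bisect.bisect_right(REGION_START_KEYS, row_key) - 1


def salted_key(row_key: bytes) -> bytes:
    """Prefix the key with a bucket number derived from a hash of the key."""
    bucket = hashlib.sha256(row_key).digest()[0] % NUM_BUCKETS
    return str(bucket).encode() + b"_" + row_key


def scan_prefixes(prefix: bytes) -> List[bytes]:
    """With salted keys, one logical prefix scan becomes one scan per bucket."""
    return [str(b).encode() + b"_" + prefix for b in range(NUM_BUCKETS)]


# Sequential keys for a single day, as in <date>_<somedata>.
plain = [f"20110320_{i:06d}".encode() for i in range(1000)]

plain_regions = Counter(region_for(k) for k in plain)
salted_regions = Counter(region_for(salted_key(k)) for k in plain)

print(plain_regions)    # every key falls in the same region
print(salted_regions)   # writes are spread across several regions
print(scan_prefixes(b"20110320_"))
```

The write side improves because consecutive keys no longer share a region. The
read side pays for it: fetching one day now takes NUM_BUCKETS separate range
scans whose results have to be merged by the client.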