Accumulo, mail # user - Table splitting


Re: Table splitting
ameet kini 2012-12-12, 18:45
On Wed, Dec 12, 2012 at 1:36 PM, Eric Newton <[EMAIL PROTECTED]> wrote:

> No.  If a compaction takes place, it will tend to make a block local again.
>
>

That seems to be enough to ensure that blocks are local to the new host to
which the tablet was migrated.

We do see a fair number of migrations because of the balancer, and I would do
manual compactions on those ranges to get the blocks local, but I wasn't
sure if it's the recommended way.

The LocalityCheck utility is useful, thanks.
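
For anyone following along, here is a rough model of the kind of number a
locality check reports: for each tablet server, the fraction of HDFS blocks
belonging to its tablets' files that have a replica on that same host. This is
only a sketch and not the LocalityCheck code itself; the function name, the
dictionaries, and the host and file names are all made up for illustration.

```python
from typing import Dict, List, Set


def locality_by_host(
    tablet_host: Dict[str, str],
    tablet_files: Dict[str, List[str]],
    file_block_hosts: Dict[str, List[Set[str]]],
) -> Dict[str, float]:
    """Per tablet server, the fraction of its tablets' blocks with a local replica.

    tablet_host:      tablet id -> host currently serving the tablet
    tablet_files:     tablet id -> files backing the tablet
    file_block_hosts: file path -> one set of replica hosts per block
    (All three inputs are hypothetical stand-ins for metadata and HDFS lookups.)
    """
    local: Dict[str, int] = {}
    total: Dict[str, int] = {}
    for tablet, host in tablet_host.items():
        for path in tablet_files.get(tablet, []):
            for replica_hosts in file_block_hosts.get(path, []):
                total[host] = total.get(host, 0) + 1
                if host in replica_hosts:
                    local[host] = local.get(host, 0) + 1
    return {host: local.get(host, 0) / count for host, count in total.items()}


# Made-up example: tablet t2 was migrated to node2, but one block of its
# file was written while the tablet lived elsewhere.
tablet_host = {"t1": "node1", "t2": "node2"}
tablet_files = {"t1": ["/f1.rf"], "t2": ["/f2.rf"]}
file_block_hosts = {
    "/f1.rf": [{"node1", "node3"}, {"node1", "node2"}],
    "/f2.rf": [{"node1", "node3"}, {"node2", "node3"}],
}

print(locality_by_host(tablet_host, tablet_files, file_block_hosts))
# prints {'node1': 1.0, 'node2': 0.5}
```

In this model, a compaction of t2 on node2 would rewrite /f2.rf with a first
replica on node2 for every block (disk space permitting), which would bring
node2 back to 1.0.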

Ameet
> We do keep track of the last location in which a file was written so we
> can attempt to put a tablet back there, but that's about all the server
> does to preserve locality.
>
> -Eric
>
>
> On Wed, Dec 12, 2012 at 1:32 PM, ameet kini <[EMAIL PROTECTED]> wrote:
>
>> If it's moved for balance or recovery purposes, are there any mechanisms
>> to copy the blocks over to the new location? Would compaction be this
>> mechanism? Or is it automatically done as part of tablet migration?
>>
>>
>> On Wed, Dec 12, 2012 at 1:29 PM, Eric Newton <[EMAIL PROTECTED]> wrote:
>>
>>> Well, you have to assume the tablet does not get moved for balancing or
>>> recovery.
>>>
>>> -Eric
>>>
>>>
>>> On Wed, Dec 12, 2012 at 1:27 PM, ameet kini <[EMAIL PROTECTED]> wrote:
>>>
>>>>
>>>> Ok, so in short, assuming that there's sufficient local disk space, a
>>>> given tablet having all its blocks local relies on HDFS's guarantee that
>>>> the first replica of a block will be local as long as the tablet server is
>>>> also a data node. Yes?
>>>>
>>>>
>>>>
>>>> On Wed, Dec 12, 2012 at 1:18 PM, Eric Newton <[EMAIL PROTECTED]> wrote:
>>>>
>>>>> Check out o.a.a.server.util.LocalityCheck
>>>>>
>>>>> -Eric
>>>>>
>>>>>
>>>>> On Wed, Dec 12, 2012 at 1:17 PM, John Vines <[EMAIL PROTECTED]> wrote:
>>>>>
>>>>>> When a file gets written to HDFS, there is a guarantee the file is
>>>>>> local as long as that system's disks are not full. Accumulo does not have a
>>>>>> locality guarantee as tablets will migrate on occasion. However, as data is
>>>>>> added, major compactions will occur which will restore locality.
>>>>>>  On Dec 12, 2012 1:09 PM, "ameet kini" <[EMAIL PROTECTED]> wrote:
>>>>>>
>>>>>>>
>>>>>>> Along these lines....
>>>>>>>
>>>>>>> Can someone help me understand how tablets map to files on disk in
>>>>>>> HDFS? From what I understand, after a compaction, there may be one (or
>>>>>>> more?) files on HDFS for a given tablet. Each file can consist of multiple
>>>>>>> HDFS blocks. Does Accumulo guarantee that the tablet serving a given data
>>>>>>> range finds all its blocks locally? If so, how does it keep this guarantee?
>>>>>>> Wouldn't HDFS distribute these blocks around based on HDFS balancing
>>>>>>> strategy?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Ameet
>>>>>>>
>>>>>>> On Tue, Dec 11, 2012 at 9:37 AM, William Slacum <
>>>>>>> [EMAIL PROTECTED]> wrote:
>>>>>>>
>>>>>>>> Tablets will split automatically, down to the granularity of a row.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Dec 11, 2012 at 9:32 AM, Mathias Herberts <
>>>>>>>> [EMAIL PROTECTED]> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I've read the user manual for v1.4.2 and I have not seen any
>>>>>>>>> mention of automatic tablet splitting. Is there such a thing in Accumulo or
>>>>>>>>> is pre-splitting the only way to split a table?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Mathias.
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>