On Thu, Mar 21, 2013 at 1:44 PM, Brennon Church <[EMAIL PROTECTED]> wrote:
> Here are the data locality index values for all 8 nodes:
> Those seem pretty bad to me.
Yeah, considering that you have 8 nodes and probably use a replication
factor of 3, I would expect you to be at least 38% local (3 replicas
across 8 nodes) in case of a wrongful restart (but then minor
compactions probably ran and that brought you up).
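
As a rough sanity check on that 38% figure: if each block has 3 replicas
placed uniformly at random across 8 datanodes, a region server that ends up
on an arbitrary node finds a local replica for about 3/8 of its blocks. A
minimal sketch of that arithmetic follows. The function name and the
uniform-placement assumption are illustrative only, not something HBase
reports.

```python
def expected_locality(replication: int, nodes: int) -> float:
    """Estimate the fraction of blocks with a local replica.

    Assumes replicas are spread uniformly at random over the datanodes and
    the region was reassigned to an arbitrary node (a simplification).
    """
    if replication <= 0 or nodes <= 0:
        raise ValueError("replication and nodes must be positive")
    # With more replicas than nodes every node holds a copy, so cap at 1.0.
    return min(1.0, replication / nodes)


if __name__ == "__main__":
    # 3 replicas over 8 nodes gives 0.375, i.e. roughly 38%.
    print(f"{expected_locality(3, 8):.1%}")
```

Anything well below this floor would suggest something other than a plain
reassignment is going on.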
> I'm running HBase v. 0.92.0
> I'd considered the async problem, and was going to add some basic checks
> into the script to not submit additional compactions to the queue if I saw
> that it had anything in it already.
> For the moment, it seems my best bet is to run through the major compactions
> for everything to regain locality. Going forward, we may or may not need
> the major compactions on a regular basis. I can tell you it's been several
> months since we turned them off, and performance has been reasonable.
FWIW your data should be cached now so major compacting will do no
good (unless you mostly do full table scans, in which case the caching
doesn't do anything for you).
You shouldn't see a big difference turning major compactions off if
you don't delete/update a lot.
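
For the script idea discussed in this thread (submit one region at a time,
skip submitting while the compaction queue is non-empty, stop when the
maintenance window closes), here is a minimal sketch. Everything in it is
hypothetical: `submit` stands in for whatever issues the major compaction
for one region, and `queue_size` for whatever reads the compaction queue
length (for example over JMX). Neither is an HBase API.

```python
import time
from typing import Callable, Iterable, List


def compact_in_turn(
    regions: Iterable[str],
    submit: Callable[[str], None],      # hypothetical: request one major compaction
    queue_size: Callable[[], int],      # hypothetical: current compaction queue length
    deadline: float,                    # clock() value at which the window closes
    poll_seconds: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit regions one at a time; return the regions that were not reached."""
    pending = list(regions)
    while pending:
        # Do not pile up requests: wait until the queue has drained.
        while queue_size() > 0:
            if clock() >= deadline:
                return pending
            sleep(poll_seconds)
        # Never start new work once the window has closed.
        if clock() >= deadline:
            return pending
        # The request is async, so the queue may take a moment to reflect it;
        # a real script might pause briefly after submitting.
        submit(pending.pop(0))
    return pending
```

The returned list is what a later run (say, the next night's window) would
pick up, which is one way to do only a subset of the regions at a time. A
real cluster has one queue per region server, so `queue_size` would have to
check the server hosting each region or take the maximum over all of them.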
> On 3/21/13 10:49 AM, Jean-Daniel Cryans wrote:
>> On Thu, Mar 21, 2013 at 6:46 AM, Brennon Church <[EMAIL PROTECTED]> wrote:
>>> Hello all,
>>> As I understand it, a common performance tweak is to disable major
>>> compactions so that you don't end up with storms taking things out at
>>> inconvenient times. I'm thinking that I should just write a quick script to
>>> rotate through all of our regions, one at a time, and compact them.
>>> If I'm understanding this correctly we should not end up with storms as
>>> they'll only happen one at a time, and each one doesn't run for long.
>>> Does that seem reasonable, or am I missing something? My hope is to run the
>>> script regularly.
>> FWIW major compacting isn't even needed if you don't update or delete
>> cells so do consider that too.
>> The problem with scheduling major compactions yourself is that, since
>> the command is async, you can still end up with a storm of compactions
>> if you just blindly issue major_compact for all your regions. Things
>> like adding wait time work, but then let's say you want the
>> compactions to run only between 2 and 4 AM, then you can run out of
>> time. What I have seen to circumvent this is to only do a subset of
>> the regions at a time. You can also use JMX to monitor the compaction
>> queue on each RS and make sure you are not just piling them up, but
>> this requires some more work.
>>> Corollary question... I recently added drives to our nodes and since I did
>>> this while they were all still running, basically just restarting the
>>> datanode underneath to pick up the new spindles, I'm fairly sure I've thrown
>>> data locality out the window, based on the changed pattern of network traffic.
>> Interesting but unlikely. Even restarting HBase shouldn't do that
>> unless it was wrongly restarted. Each RS publishes a locality index
>> (hdfsBlocksLocalityIndex) that you can find via JMX or in their web
>> UI, are they close to 100% or way down? Also which version are you on?
>>> If I'm right, manually running major compactions against all of
>>> the regions should resolve that, as the underlying data would all get
>>> written locally. Again, does that make sense?
>> Major compacting would do that, yes, but I think you should first
>> check whether you need it at all.