-
Time requirement between shutting down tablet servers?
Steven Troxell 2012-08-06, 14:25
Is there a problem with shutting down tablet servers in quick succession? I am attempting to scale back from 10 tservers to 2 for benchmark testing, but I am running into problems where, at some point, the monitor stops showing the remaining servers (the ones I hadn't gotten to kill yet) as online. I see numerous Connection refused and unable-to-recover errors in my logs, but there's no consistency as to how many servers I can shut down before I lose everything. The only thing I've picked up on is higher success rates when I leave larger gaps of time between shutting down servers. Is this reasonable/expected behavior?
I am using the bin/stop-here.sh command to kill servers. Alternatively, I have tried ./bin/stop-all.sh, then running ./bin/start-here.sh on the master and on the individual tablet servers I want running, but that doesn't seem to bring them up.
Thanks, Steve
-
Re: Time requirement between shutting down tablet servers?
Eric Newton 2012-08-06, 16:28
You are killing loggers, which means that recovery cannot take place when tablets are moved to the remaining servers.
Try:
$ ./bin/accumulo admin stop host:port
This will gracefully stop the tserver and logger on that machine, and flush the tablets with references to logs on that machine.
-Eric
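A minimal sketch of applying the command above to several tablet servers in turn, for example when scaling from 10 tservers down to 2. It is not from the thread: the function name stop_tservers, the host names, the ACCUMULO_HOME default, the port 9997, and the DRY_RUN switch are all assumptions made for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: gracefully stop a list of tablet servers, one at a time,
# using "accumulo admin stop host:port" as suggested in the thread.
# Assumptions (not from the thread): ACCUMULO_HOME, TSERVER_PORT,
# DRY_RUN, and the function name are hypothetical; adjust for your site.
ACCUMULO_HOME="${ACCUMULO_HOME:-.}"
TSERVER_PORT="${TSERVER_PORT:-9997}"   # assumed tserver client port
DRY_RUN="${DRY_RUN:-1}"                # 1 = print commands only

stop_tservers() {
    for host in "$@"; do
        cmd="$ACCUMULO_HOME/bin/accumulo admin stop $host:$TSERVER_PORT"
        if [ "$DRY_RUN" = "1" ]; then
            # Show what would be run without touching the cluster.
            echo "$cmd"
        else
            # Stop at the first failure rather than continuing blindly.
            $cmd || return 1
        fi
    done
}
```

For example, `stop_tservers node3 node4` prints one admin stop command per host; setting DRY_RUN=0 would actually run them, stopping at the first failure.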
-
Re: Time requirement between shutting down tablet servers?
Steven Troxell 2012-08-06, 16:56
Thanks Eric,
I think that's the command Adam gave me initially and that I was trying to recall; it looks familiar anyway. I don't remember where or why I switched from using that to stop-here.sh.
-
Re: Time requirement between shutting down tablet servers?
John Vines 2012-08-06, 17:01
Perhaps we should direct stop-here.sh to utilize admin stop. Or at the very least rename stop-here to kill-here to make it clear that it's rough around the edges.
John
-
Re: Time requirement between shutting down tablet servers?
David Medinets 2012-08-06, 17:14
On Mon, Aug 6, 2012 at 1:01 PM, John Vines <[EMAIL PROTECTED]> wrote:
> Perhaps we should direct stop-here.sh to utilize admin stop. Or at the very
> least rename stop-here to kill-here to make it clear that it's rough around
> the edges.
I like the name change that indicates intent.