Accumulo >> mail # user >> RE: [External]  Re: locked fate threads


Losco, Jason [USA] 2013-09-05, 13:27
Re: [External] Re: locked fate threads
stop-all probably won't work.  I'm suggesting a cluster-wide kill of all
tablet servers:

$ pssh -h conf/slaves pkill -f =tserve[r]   # <--- requires parallel ssh to be installed

On the master host:

$ pkill -f =master

Wait for the master lock to expire (typically 30 seconds), and kill all the
fate transactions:

$ ./bin/accumulo org.apache.accumulo.server.fate.Admin kill "<txid>"

Then do a start-all and cross your fingers. :-)
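
The kill step above has to be repeated once per stuck transaction. A minimal sketch of how that could be scripted, assuming the "fate print" output was saved to a file before the master was stopped. The helper names extract_txids and print_kill_commands and the file name fate-print.txt are hypothetical and not part of Accumulo. The commands are only echoed so they can be reviewed before anything is run:

```shell
#!/bin/sh
# Sketch only: print (do not run) one kill command per fate transaction id.
# Assumes "fate print" output was saved to a file beforehand.
# extract_txids and print_kill_commands are hypothetical helper names.

extract_txids() {
    # Every transaction entry contains "txid: <hex id>"; print the id that
    # follows the label. A leading "> " quote marker is tolerated.
    awk '{ for (i = 1; i < NF; i++) if ($i == "txid:") print $(i + 1) }' "$1"
}

print_kill_commands() {
    extract_txids "$1" | while read -r txid; do
        # Same invocation as in the message above, echoed for review first.
        echo "./bin/accumulo org.apache.accumulo.server.fate.Admin kill \"$txid\""
    done
}
```

For example, `print_kill_commands fate-print.txt` would list the commands; paste them into the shell on the master host only after checking the ids against the stuck transactions.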

-Eric
On Thu, Sep 5, 2013 at 9:27 AM, Losco, Jason [USA] <[EMAIL PROTECTED]> wrote:

> Thanks for the quick response.  I issued the command to take those
> offline; however, they were locked up due to the other threads, so it didn’t
> take.  How do I go about deleting those fate transactions?  Fate delete and
> fate fail do not work from the shell.  Are you suggesting a stop-all of
> Accumulo, then running something using the actual AdminUtil class to kill
> those transactions?  Any input into how to kick off that process would be
> greatly appreciated.
>
> losco
>
> *From:* Eric Newton [mailto:[EMAIL PROTECTED]]
> *Sent:* Thursday, September 05, 2013 9:18 AM
> *To:* [EMAIL PROTECTED]
> *Subject:* [External] Re: locked fate threads
>
> I can't believe I posted a note about using deletemany on the !METADATA
> table!  That was pretty reckless of me.
>
> If you really deleted your table data doing this, and your table was
> online at the time, you need to restart your cluster.
>
> That alone might fix the problem.  Otherwise, you are going to need to
> kill the master, delete the fate transactions, restart the master, and
> properly delete the tables.
>
> -Eric
>
> On Thu, Sep 5, 2013 at 8:00 AM, Losco, Jason [USA] <[EMAIL PROTECTED]>
> wrote:
>
> I recently tried to remove some tables, during which a shell thread got
> stuck on an IO error.  A fate print plus some digging into the logs
> revealed they were stuck waiting on WAL resources.  I found a thread in
> which Eric Newton explained how to manually remove the tables by removing
> lines from the !METADATA table using “deletemany -c file,” then cleaning up
> the /accumulo/tables/<id> in hdfs.  I’ve done that; however, the fate
> threads are still locked and I am unable to delete or fail them.
> Additionally, the tables I removed from !METADATA and hdfs still appear in
> the list returned by the “tables” command in the shell.  Below is the result
> of a “fate print.”  Note that table ids a and b are the two I’ve removed.
>
> test@c4s> fate print
>
> txid: 4136e024209602eb  status: IN_PROGRESS         op: ChangeTableState
> locked: []              locking: [W:b]           top: ChangeTableState
>
> txid: 439193592e93e230  status: IN_PROGRESS         op: TableRangeOp
> locked: []              locking: [W:b]           top: TableRangeOp
>
> txid: 1576dca47dfa2c65  status: IN_PROGRESS         op: TableRangeOp
> locked: []              locking: [W:b]           top: TableRangeOp
>
> txid: 3ee6232db200f2c7  status: IN_PROGRESS         op: TableRangeOp
> locked: []              locking: [W:b]           top: TableRangeOp
>
> txid: 19e5d3349679ff6e  status: IN_PROGRESS         op: TableRangeOp
> locked: [W:a]           locking: []              top: TableRangeOpWait
>
> txid: 29204be9d141dc88  status: IN_PROGRESS         op: TableRangeOp
> locked: []              locking: [W:b]           top: TableRangeOp
>
> txid: 7d07c50ceb5ac487  status: IN_PROGRESS         op: DeleteTable
> locked: []              locking: [W:b]           top: DeleteTable
>
> txid: 72895b4b1a5a1640  status: IN_PROGRESS         op: DeleteTable
> locked: []              locking: [W:b]           top: DeleteTable
>
> txid: 6902bcb06c4f5ae7  status: IN_PROGRESS         op: DeleteTable
> locked: []              locking: [W:b]           top: DeleteTable
>
> txid: 08db2316eb783ba1  status: IN_PROGRESS         op: TableRangeOp