Accumulo, mail # user - Table deletion got stuck


Re: Table deletion got stuck
Lin XIAO 2012-11-28, 19:05
The tserver started on November 20:
20 15:13:14,328 [client.ZooKeeperInstance] DEBUG: Trying to read instance id from /table/accumulo/instance_id
20 15:13:14,328 [server.Accumulo] INFO : Instance c8e02396-a69f-48be-aec2-045bbc55fa0c

There was no FATAL log message.
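
For anyone repeating this check on a larger set of logs, here is a minimal Python sketch of the two searches Keith asked for (startup "Instance" lines and FATAL messages). It assumes each log entry sits on one line in the "dd HH:mm:ss,SSS [logger] LEVEL: message" layout seen in this thread; the function name and regular expressions are my own and not part of Accumulo.

```python
import re

# Assumed layout, taken from the log lines quoted in this thread:
#   "20 15:13:14,328 [server.Accumulo] INFO : Instance <instance-id>"
STARTUP = re.compile(
    r"^(?P<ts>\d+ [\d:,]+) \[server\.Accumulo\]\s+INFO\s*:\s+Instance\s+(?P<id>\S+)"
)
# Any entry logged at FATAL level, whatever the logger name.
FATAL = re.compile(r"\]\s+FATAL\s*:")


def scan_tserver_log(lines):
    """Return (startups, fatals) found in an iterable of log lines.

    startups is a list of (timestamp, instance_id) tuples, one per
    server start; fatals is a list of the raw FATAL lines.
    """
    startups, fatals = [], []
    for line in lines:
        line = line.rstrip("\n")
        match = STARTUP.match(line)
        if match:
            startups.append((match.group("ts"), match.group("id")))
        if FATAL.search(line):
            fatals.append(line)
    return startups, fatals


if __name__ == "__main__":
    sample = [
        "20 15:13:14,328 [client.ZooKeeperInstance] DEBUG: Trying to read "
        "instance id from /table/accumulo/instance_id",
        "20 15:13:14,328 [server.Accumulo] INFO : Instance "
        "c8e02396-a69f-48be-aec2-045bbc55fa0c",
        "27 11:52:25,230 [tabletserver.Tablet] TABLET_HIST: n8<< opened",
    ]
    print(scan_tserver_log(sample))
```

To run it against a real file, pass an open file object, for example scan_tserver_log(open(path)), where path is wherever your tserver log lives.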

On Wed, Nov 28, 2012 at 11:08 AM, Keith Turner <[EMAIL PROTECTED]> wrote:
> 27 11:48:04,332 [tableOps.CleanUp] DEBUG: Still waiting for table to be
> deleted: n8 locationState: n8<<@(null,10.0.0.10:41000[43b1b039a081368],null)
>
> The delete code waits for all tablets related to a table to unload. It will
> wait for tablets that are assigned or hosted. An assigned tablet is one
> that the master has asked a tablet server to load, but that the tablet
> server has not yet loaded. A hosted tablet is one that is loaded.
>
> The debug message seems to indicate n8<< was hosted. If it were assigned
> but not hosted, I think it would print
> n8<<@(10.0.0.10:41000[43b1b039a081368],null,null) instead of
> n8<<@(null,10.0.0.10:41000[43b1b039a081368],null).
>
> So the master thought the tablet was loaded, even though the tablet load
> time was later.
>
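
The reading of the locationState string described above can be sketched in Python. This is only an illustration of Keith's explanation: it assumes the first slot in the parentheses is the assigned location and the second is the hosted location, and it ignores the third slot, which the thread does not explain. The function name and the "unassigned" label are hypothetical.

```python
import re

# Shape taken from the master debug message quoted in this thread:
#   <extent>@(<first>,<second>,<third>)
LOCATION_STATE = re.compile(
    r"^(?P<extent>.+)@\((?P<first>[^,]+),(?P<second>[^,]+),(?P<third>[^,]+)\)$"
)


def classify(location_state):
    """Classify a tablet as 'hosted', 'assigned', or 'unassigned'.

    Assumption (from the explanation in this thread): a non-null first
    slot means the master has asked a server to load the tablet, and a
    non-null second slot means a server has loaded it.
    """
    match = LOCATION_STATE.match(location_state.strip())
    if match is None:
        raise ValueError("unrecognized locationState: %r" % location_state)
    if match.group("second") != "null":
        return "hosted"
    if match.group("first") != "null":
        return "assigned"
    return "unassigned"


if __name__ == "__main__":
    print(classify("n8<<@(null,10.0.0.10:41000[43b1b039a081368],null)"))
    print(classify("n8<<@(10.0.0.10:41000[43b1b039a081368],null,null)"))
```

Under that assumption, the string from the master log classifies as hosted, which matches the reading above.
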
> What times did the tablet server start?  You can grep tablet server logs for
> "Instance".  The Accumulo instance id is logged when a tablet server starts.
> Also, are there any FATAL messages in the tserver logs?
>
> Keith
>
> On Wed, Nov 28, 2012 at 10:44 AM, Lin XIAO <[EMAIL PROTECTED]> wrote:
>>
>> n8 was an empty table created through the shell.  Here are the logs on
>> machine 10.0.0.10
>>
>> 27 11:52:25,220 [tabletserver.TabletServer] INFO : Loading tablet n8<<
>> 27 11:52:25,221 [tabletserver.TabletServer] INFO :
>> cloud9/10.0.0.10:41000: got assignment from master: n8<<
>> 27 11:52:25,221 [tabletserver.TabletServer] DEBUG: Loading extent: n8<<
>> 27 11:52:25,221 [tabletserver.TabletServer] DEBUG: verifying extent n8<<
>> 27 11:52:25,223 [tabletserver.Tablet] DEBUG: Looking at metadata {n8<
>> future:43b1b039a081368 [] 423355 false=10.0.0.10:41000, n8< srv:dir []
>> 423354 false=/default_tablet, n8< srv:lock [] 423354
>> false=masters/lock/zlock-0000000184$43b1b039a08ad85, n8< srv:time []
>> 423354 false=M0, n8< ~tab:~pr [] 423354 false=}
>> 27 11:52:25,223 [tabletserver.Tablet] DEBUG: got [] for logs for n8<<
>> 27 11:52:25,230 [tabletserver.Tablet] TABLET_HIST: n8<< opened
>>
>> Thanks,
>> Lin
>>
>> On Wed, Nov 28, 2012 at 8:55 AM, Keith Turner <[EMAIL PROTECTED]> wrote:
>> > Can you look at the logs for tablet server 10.0.0.10 and see what was
>> > going on with tablet n8<<?
>> >
>> > Keith
>> >
>> >
>> > On Tue, Nov 27, 2012 at 6:20 PM, Lin XIAO <[EMAIL PROTECTED]> wrote:
>> >>
>> >> I've only gone through the master log generated today, looking for
>> >> FAILED transactions.
>> >> The CreateTable operations failed because the table already exists, while
>> >> the DeleteTable operations failed because the table doesn't exist. I think
>> >> the user ran his Hadoop jobs several times with the same table names. If
>> >> the table cannot be deleted, the following create operations will fail.
>> >> I'm not sure why he tried to delete a nonexistent table, though.
>> >>
>> >> 27 04:52:16,547 [fate.Fate] WARN : Failed to execute Repo, tid=1f4c647a48c383a6
>> >> ThriftTableOperationException(tableId:gf, tableName:, op:DELETE, type:NOTFOUND, description:Table does not exists)
>> >> at org.apache.accumulo.server.master.tableOps.Utils.reserveTable(Utils.java:82)
>> >> at org.apache.accumulo.server.master.tableOps.DeleteTable.isReady(DeleteTable.java:224)
>> >> at org.apache.accumulo.server.master.tableOps.DeleteTable.isReady(DeleteTable.java:212)
>> >> at org.apache.accumulo.server.master.tableOps.TraceRepo.isReady(TraceRepo.java:50)
>> >> at org.apache.accumulo.server.fate.Fate$TransactionRunner.run(Fate.java:62)
>> >> at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
>> >> at java.lang.Thread.run(Thread.java:662)
>> >> 27 04:52:16,564 [zookeeper.DistributedReadWriteLock] DEBUG: Removing