Monday, 30 May 2011

Detecting and Ending Deadlocks


A deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock. The following graph presents a high level view of a deadlock state where:
  • Task T1 has a lock on resource R1 (indicated by the arrow from R1 to T1) and has requested a lock on resource R2 (indicated by the arrow from T1 to R2).
  • Task T2 has a lock on resource R2 (indicated by the arrow from R2 to T2) and has requested a lock on resource R1 (indicated by the arrow from T2 to R1).
  • Because neither task can continue until a resource is available and neither resource can be released until a task continues, a deadlock state exists.
Diagram showing tasks in a deadlock state
The SQL Server Database Engine automatically detects deadlock cycles within SQL Server. The Database Engine chooses one of the sessions as a deadlock victim and the current transaction is terminated with an error to break the deadlock.
 Resources That Can Deadlock
Each user session might have one or more tasks running on its behalf where each task might acquire or wait to acquire a variety of resources. The following types of resources can cause blocking that could result in a deadlock.
  • Locks. Waiting to acquire locks on resources, such as objects, pages, rows, metadata, and applications can cause deadlock. For example, transaction T1 has a shared (S) lock on row r1 and is waiting to get an exclusive (X) lock on r2. Transaction T2 has a shared (S) lock on r2 and is waiting to get an exclusive (X) lock on row r1. This results in a lock cycle in which T1 and T2 wait for each other to release the locked resources.
  • Worker threads. A queued task waiting for an available worker thread can cause deadlock. If the queued task owns resources that are blocking all worker threads, a deadlock will result. For example, session S1 starts a transaction and acquires a shared (S) lock on row r1 and then goes to sleep. Active sessions running on all available worker threads are trying to acquire exclusive (X) locks on row r1. Because session S1 cannot acquire a worker thread, it cannot commit the transaction and release the lock on row r1. This results in a deadlock.
  • Memory. When concurrent requests are waiting for memory grants that cannot be satisfied with the available memory, a deadlock can occur. For example, two concurrent queries, Q1 and Q2, execute as user-defined functions that acquire 10MB and 20MB of memory respectively. If each query needs 30MB and the total available memory is 20MB, then Q1 and Q2 must wait for each other to release memory, and this results in a deadlock.
  • Parallel query execution-related resources Coordinator, producer, or consumer threads associated with an exchange port may block each other causing a deadlock usually when including at least one other process that is not a part of the parallel query. Also, when a parallel query starts execution, SQL Server determines the degree of parallelism, or the number of worker threads, based upon the current workload. If the system workload unexpectedly changes, for example, where new queries start running on the server or the system runs out of worker threads, then a deadlock could occur.
  • Multiple Active Result Sets (MARS) resources. These resources are used to control interleaving of multiple active requests under MARS (see Batch Execution Environment and MARS [ http://msdn.microsoft.com/en-us/library/ms187051(SQL.90).aspx ] ).
    • User resource. When a thread is waiting for a resource that is potentially controlled by a user application, the resource is considered to be an external or user resource and is treated like a lock.
    • Session mutex. The tasks running in one session are interleaved, meaning that only one task can run under the session at a given time. Before the task can run, it must have exclusive access to the session mutex.
    • Transaction mutex. All tasks running in one transaction are interleaved, meaning that only one task can run under the transaction at a given time. Before the task can run, it must have exclusive access to the transaction mutex.
In order for a task to run under MARS, it must acquire the session mutex. If the task is running under a transaction, it must then acquire the transaction mutex. This guarantees that only one task is active at one time in a given session and a given transaction. Once the required mutexes have been acquired, the task can execute. When the task finishes, or yields in the middle of the request, it will first release transaction mutex followed by the session mutex in reverse order of acquisition. However, deadlocks can occur with these resources. In the following code example, two tasks, user request U1 and user request U2, are running in the same session.

U1:    Rs1=Command1.Execute("insert sometable EXEC usp_someproc");
U2:    Rs2=Command2.Execute("select colA from sometable");
The stored procedure executing from user request U1 has acquired the session mutex. If the stored procedure takes a long time to execute, it is assumed by the Database Engine that the stored procedure is waiting for input from the user. User request U2 is waiting for the session mutex while the user is waiting for the result set from U2, and U1 is waiting for a user resource. This is deadlock state logically illustrated as:
Logic diagram showing user process deadlock.
 Deadlock Detection
All of the resources listed in the section above participate in the Database Engine deadlock detection scheme. Deadlock detection is performed by a lock monitor thread that periodically initiates a search through all of the tasks in an instance of the Database Engine. The following points describe the search process:
  • The default interval is 5 seconds.
  • If the lock monitor thread finds deadlocks, the deadlock detection interval will drop from 5 seconds to as low as 100 milliseconds depending on the frequency of deadlocks.
  • If the lock monitor thread stops finding deadlocks, the Database Engine increases the intervals between searches to 5 seconds.
  • If a deadlock has just been detected, it is assumed that the next threads that must wait for a lock are entering the deadlock cycle. The first couple of lock waits after a deadlock has been detected will immediately trigger a deadlock search rather than wait for the next deadlock detection interval. For example, if the current interval is 5 seconds, and a deadlock was just detected, the next lock wait will kick off the deadlock detector immediately. If this lock wait is part of a deadlock, it will be detected right away rather than during next deadlock search.
The Database Engine typically performs periodic deadlock detection only. Because the number of deadlocks encountered in the system is usually small, periodic deadlock detection helps to reduce the overhead of deadlock detection in the system.
When the lock monitor initiates deadlock search for a particular thread, it identifies the resource on which the thread is waiting. The lock monitor then finds the owner(s) for that particular resource and recursively continues the deadlock search for those threads until it finds a cycle. A cycle identified in this manner forms a deadlock.
After a deadlock is detected, the Database Engine ends a deadlock by choosing one of the threads as a deadlock victim. The Database Engine terminates the current batch being executed for the thread, rolls back the transaction of the deadlock victim, and returns a 1205 error to the application. Rolling back the transaction for the deadlock victim releases all locks held by the transaction. This allows the transactions of the other threads to become unblocked and continue. The 1205 deadlock victim error records information about the threads and resources involved in a deadlock in the error log.
By default, the Database Engine chooses as the deadlock victim the session running the transaction that is least expensive to roll back. Alternatively, a user can specify the priority of sessions in a deadlock situation using the SET DEADLOCK_PRIORITY statement. DEADLOCK_PRIORITY can be set to LOW, NORMAL, or HIGH, or alternatively can be set to any integer value in the range (-10 to 10). The deadlock priority defaults to NORMAL. If two sessions have different deadlock priorities, the session with the lower priority is chosen as the deadlock victim. If both sessions have the same deadlock priority, the session with the transaction that is least expensive to roll back is chosen. If sessions involved in the deadlock cycle have the same deadlock priority and the same cost, a victim is chosen randomly.
When working with CLR, the deadlock monitor automatically detects deadlock for synchronization resources (monitors, reader/writer lock and thread join) accessed inside managed procedures. However, the deadlock is resolved by throwing an exception in the procedure that was selected to be the deadlock victim. It is important to understand that the exception does not automatically release resources currently owned by the victim; the resources must be explicitly released. Consistent with exception behavior, the exception used to identify a deadlock victim can be caught and dismissed.
 Deadlock Information Tools
To view deadlock information, the Database Engine provides monitoring tools in the form of two trace flags, and the deadlock graph event in SQL Server Profiler.

Trace Flag 1204 and Trace Flag 1222

When deadlocks occur, trace flag 1204 and trace flag 1222 return information that is captured in the SQL Server 2005 error log. Trace flag 1204 reports deadlock information formatted by each node involved in the deadlock. Trace flag 1222 formats deadlock information, first by processes and then by resources. It is possible to enable both trace flags to obtain two representations of the same deadlock event.
In addition to defining the properties of trace flag 1204 and 1222, the following table also shows the similarities and differences.
Property
Trace Flag 1204 and Trace Flag 1222
Trace Flag 1204 only
Trace Flag 1222 only
Output format
Output is captured in the SQL Server 2005 error log.
Focused on the nodes involved in the deadlock. Each node has a dedicated section, and the final section describes the deadlock victim.
Returns information in an XML-like format that does not conform to an XML Schema Definition (XSD) schema. The format has three major sections. The first section declares the deadlock victim. The second section describes each process involved in the deadlock. The third section describes the resources that are synonymous with nodes in trace flag 1204.
Identifying attributes
SPID:<x> ECID:<x>. Identifies the system process ID thread in cases of parallel processes. The entry SPID:<x> ECID:0, where <x> is replaced by the SPID value, represents the main thread. The entry SPID:<x> ECID:<y>, where <x> is replaced by the SPID value and <y> is greater than 0, represents the sub-threads for the same SPID.
BatchID (sbid for trace flag 1222). Identifies the batch from which code execution is requesting or holding a lock. When Multiple Active Result Sets (MARS) is disabled, the BatchID value is 0. When MARS is enabled, the value for active batches is 1 to n. If there are no active batches in the session, BatchID is 0.
Mode. Specifies the type of lock for a particular resource that is requested, granted, or waited on by a thread. Mode can be IS (Intent Shared), S (Shared), U (Update), IX (Intent Exclusive), SIX (Shared with Intent Exclusive), and X (Exclusive). For more information, see Lock Modes [ http://msdn.microsoft.com/en-us/library/ms175519(SQL.90).aspx ] .
Line # (line for trace flag 1222). Lists the line number in the current batch of statements that was being executed when the deadlock occurred.
Input Buf (inputbuf for trace flag 1222). Lists all the statements in the current batch.
Node. Represents the entry number in the deadlock chain.
Lists. The lock owner can be part of these lists:
  • Grant List. Enumerates the current owners of the resource.
  • Convert List. Enumerates the current owners that are trying to convert their locks to a higher level.
  • Wait List. Enumerates current new lock requests for the resource.
Statement Type. Describes the type of DML statement (SELECT, INSERT, UPDATE, or DELETE) on which the threads have permissions.
Victim Resource Owner. Specifies the participating thread that SQL Server chooses as the victim to break the deadlock cycle. The chosen thread and all existing sub-threads are terminated.
Next Branch. Represents the two or more sub-threads from the same SPID that are involved in the deadlock cycle.
deadlock victim. Represents the physical memory address of the task (see sys.dm_os_tasks [ http://msdn.microsoft.com/en-us/library/ms174963(SQL.90).aspx ] ) that was selected as a deadlock victim. It may be 0 (zero) in the case of an unresolved deadlock. A task that is rolling back cannot be chosen as a deadlock victim.
executionstack. Represents Transact-SQL code that is being executed at the time the deadlock occurs.
priority. Represents deadlock priority. In certain cases, the Database Engine may opt to alter the deadlock priority for a short duration to achieve better concurrency.
logused. Log space used by the task.
owner id. The ID of the transaction that has control of the request.
status. State of the task. It is one of the following values:
  • pending. Waiting for a worker thread.
  • runnable. Ready to run but waiting for a quantum.
  • running. Currently running on the scheduler.
  • suspended. Execution is suspended.
  • done. Task has completed.
  • spinloop. Waiting for a spinlock to become free.
waitresource. The resource needed by the task.
waittime. Time in milliseconds waiting for the resource.
schedulerid. Scheduler associated with this task. See sys.dm_os_schedulers [ http://msdn.microsoft.com/en-us/library/ms177526(SQL.90).aspx ] .
hostname. The name of the workstation.
isolationlevel. The current transaction isolation level.
Xactid. The ID of the transaction that has control of the request.
currentdb. The ID of the database.
lastbatchstarted. The last time a client process started batch execution.
lastbatchcompleted. The last time a client process completed batch execution.
clientoption1 and clientoption2. Set options on this client connection. This is a bitmask that includes information about options usually controlled by SET statements such as SET NOCOUNT and SET XACTABORT.
associatedObjectId. Represents the HoBT (heap or b-tree) ID.
Resource attributes
RID. Identifies the single row within a table on which a lock is held or requested. RID is represented as RID: db_id:file_id:page_no:row_no. For example, RID: 6:1:20789:0.
OBJECT. Identifies the table on which a lock is held or requested. OBJECT is represented as OBJECT: db_id:object_id. For example, TAB: 6:2009058193.
KEY. Identifies the key range within an index on which a lock is held or requested. KEY is represented as KEY: db_id:hobt_id (index key hash value). For example, KEY: 6:72057594057457664 (350007a4d329).
PAG. Identifies the page resource on which a lock is held or requested. PAG is represented as PAG: db_id:file_id:page_no. For example, PAG: 6:1:20789.
EXT. Identifies the extent structure. EXT is represented as EXT: db_id:file_id:extent_no. For example, EXT: 6:1:9.
DB. Identifies the database lock. DB is represented in one of the following ways:
  • DB: db_id
  • DB: db_id[BULK-OP-DB], which identifies the database lock taken by the backup database.
  • DB: db_id[BULK-OP-LOG], which identifies the lock taken by the backup log for that particular database.
APP. Identifies the lock taken by an application resource. APP is represented as APP: lock_resource. For example, APP: Formf370f478.
METADATA. Represents metadata resources involved in a deadlock. Because METADATA has many subresources, the value returned depends upon the subresource that has deadlocked. For example, METADATA.USER_TYPE returns user_type_id = <integer_value>. For more information about METADATA resources and subresources, see sys.dm_tran_locks [ http://msdn.microsoft.com/en-us/library/ms190345(SQL.90).aspx ] .
HOBT. Represents a heap or b-tree involved in a deadlock.
None exclusive to this trace flag.
None exclusive to this trace flag.

Trace Flag 1204 Example

The following example shows the output when trace flag 1204 is turned on. In this case, the table in Node 1 is a heap with no indexes, and the table in Node 2 is a heap with a nonclustered index. The index key in Node 2 is being updated when the deadlock occurs.

Deadlock encountered .... Printing deadlock information
Wait-for graph
 
Node:1
 
RID: 6:1:20789:0               CleanCnt:3 Mode:X Flags: 0x2
Grant List 0:
   Owner:0x0315D6A0 Mode: X        
     Flg:0x0 Ref:0 Life:02000000 SPID:55 ECID:0 XactLockInfo: 0x04D9E27C
   SPID: 55 ECID: 0 Statement Type: UPDATE Line #: 6
   Input Buf: Language Event: 
BEGIN TRANSACTION
EXEC usp_p2
Requested By: 
   ResType:LockOwner Stype:'OR'Xdes:0x03A3DAD0 
     Mode: U SPID:54 BatchID:0 ECID:0 TaskProxy:(0x04976374) Value:0x315d200 Cost:(0/868)
 
Node:2
 
KEY: 6:72057594057457664 (350007a4d329) CleanCnt:2 Mode:X Flags: 0x0
Grant List 0:
   Owner:0x0315D140 Mode: X        
     Flg:0x0 Ref:0 Life:02000000 SPID:54 ECID:0 XactLockInfo: 0x03A3DAF4
   SPID: 54 ECID: 0 Statement Type: UPDATE Line #: 6
   Input Buf: Language Event: 
     BEGIN TRANSACTION
EXEC usp_p1
Requested By: 
   ResType:LockOwner Stype:'OR'Xdes:0x04D9E258 
     Mode: U SPID:55 BatchID:0 ECID:0 TaskProxy:(0x0475E374) Value:0x315d4a0 Cost:(0/380)
 
Victim Resource Owner:
ResType:LockOwner Stype:'OR'Xdes:0x04D9E258 
     Mode: U SPID:55 BatchID:0 ECID:0 TaskProxy:(0x0475E374) Value:0x315d4a0 Cost:(0/380)

Trace Flag 1222 Example

The following example shows the output when trace flag 1222 is turned on. In this case, one table is a heap with no indexes, and the other table is a heap with a nonclustered index. In the second table, the index key is being updated when the deadlock occurs.

deadlock-list
deadlock victim=process689978
  process-list
   process id=process6891f8 taskpriority=0 logused=868 
   waitresource=RID: 6:1:20789:0 waittime=1359 ownerId=310444 
   transactionname=user_transaction 
   lasttranstarted=2005-09-05T11:22:42.733 XDES=0x3a3dad0 
   lockMode=U schedulerid=1 kpid=1952 status=suspended spid=54 
   sbid=0 ecid=0 priority=0 transcount=2 
   lastbatchstarted=2005-09-05T11:22:42.733 
   lastbatchcompleted=2005-09-05T11:22:42.733 
   clientapp=Microsoft SQL Server Management Studio - Query 
   hostname=TEST_SERVER hostpid=2216 loginname=DOMAIN\user 
   isolationlevel=read committed (2) xactid=310444 currentdb=6 
   lockTimeout=4294967295 clientoption1=671090784 clientoption2=390200
    executionStack
     frame procname=AdventureWorks.dbo.usp_p1 line=6 stmtstart=202 
     sqlhandle=0x0300060013e6446b027cbb00c69600000100000000000000
     UPDATE T2 SET COL1 = 3 WHERE COL1 = 1;     
     frame procname=adhoc line=3 stmtstart=44 
     sqlhandle=0x01000600856aa70f503b8104000000000000000000000000
     EXEC usp_p1     
    inputbuf
      BEGIN TRANSACTION
EXEC usp_p1
   process id=process689978 taskpriority=0 logused=380 
   waitresource=KEY: 6:72057594057457664 (350007a4d329)   
   waittime=5015 ownerId=310462 transactionname=user_transaction 
   lasttranstarted=2005-09-05T11:22:44.077 XDES=0x4d9e258 lockMode=U 
   schedulerid=1 kpid=3024 status=suspended spid=55 sbid=0 ecid=0 
   priority=0 transcount=2 lastbatchstarted=2005-09-05T11:22:44.077 
   lastbatchcompleted=2005-09-05T11:22:44.077 
   clientapp=Microsoft SQL Server Management Studio - Query 
   hostname=TEST_SERVER hostpid=2216 loginname=DOMAIN\user 
   isolationlevel=read committed (2) xactid=310462 currentdb=6 
   lockTimeout=4294967295 clientoption1=671090784 clientoption2=390200
    executionStack
     frame procname=AdventureWorks.dbo.usp_p2 line=6 stmtstart=200 
     sqlhandle=0x030006004c0a396c027cbb00c69600000100000000000000
     UPDATE T1 SET COL1 = 4 WHERE COL1 = 1;     
     frame procname=adhoc line=3 stmtstart=44 
     sqlhandle=0x01000600d688e709b85f8904000000000000000000000000
     EXEC usp_p2     
    inputbuf
      BEGIN TRANSACTION
EXEC usp_p2    
  resource-list
   ridlock fileid=1 pageid=20789 dbid=6 objectname=AdventureWorks.dbo.T2 
   id=lock3136940 mode=X associatedObjectId=72057594057392128
    owner-list
     owner id=process689978 mode=X
    waiter-list
     waiter id=process6891f8 mode=U requestType=wait
   keylock hobtid=72057594057457664 dbid=6 objectname=AdventureWorks.dbo.T1 
   indexname=nci_T1_COL1 id=lock3136fc0 mode=X 
   associatedObjectId=72057594057457664
    owner-list
     owner id=process6891f8 mode=X
    waiter-list
     waiter id=process689978 mode=U requestType=wait

Profiler Deadlock Graph Event

This is an event in SQL Server Profiler that presents a graphical depiction of the tasks and resources involved in a deadlock. The following example shows the output from SQL Server Profiler when the deadlock graph event is turned on.
Logic flow diagram showing user process deadlock.
For more information about running the SQL Server Profiler deadlock graph, see Analyzing Deadlocks with SQL Server Profiler [ http://msdn.microsoft.com/en-us/library/ms188246(SQL.90).aspx ] .
 See Also

No comments:

Post a Comment