INF: Understanding How the Transact-SQL KILL Command Works
ID: Q171224
|
The information in this article applies to:
-
Microsoft SQL Server versions 6.0, 6.5
SUMMARY
The Transact-SQL KILL command is used to abruptly end a SQL Server process.
Each process is often called a System Process ID (spid). The SQL Enterprise
Manager Kill Process button under Current Activity merely sends a
Transact-SQL KILL command to the server, so the server-side KILL mechanism
is the same in this case.
A spid may respond to the KILL command immediately, or after a delay, or
not at all. A delayed or non-responsive KILL command can be normal under
some conditions. This article discusses how the KILL command works, what
these delayed or non-response conditions are, and how to identify them.
MORE INFORMATION
Each database connection forms a row in sysprocesses, sometimes called a
spid or System Process ID. In SQL Server terminology, each connection is
also called a "process," but this does not imply a separate process context
in the usual sense. In SQL Server 6.0 and 6.5, each process is roughly
analogous to and serviced by a separate operating system thread. Each
database connection also consists of server data structures that keep track
of process status, transaction state, locks held, and so on. One of these
structures is called the Process Slot Structure (PSS), of which there is
one per connection. The server scans the list of PSSs to materialize the
sysprocesses virtual table. The CPU and physical_io columns from
sysprocesses are derived from the equivalent values in each PSS.
The Transact-SQL KILL command posts a "kill yourself" message to the spid's
Process Slot Structure. It appears there as a status bit, which the spid
periodically interrogates. If the spid is executing a code path that does
not interrogate the PSS status field, the KILL is not honored. Some known
conditions where this situation can occur are given below. Most of these
are considered expected behavior, and are not considered bugs.
Spid Is Waiting on a Network I/O
If the client does not fetch all result rows, the server will eventually be
forced to wait when writing to the client. This is seen as a
sysprocesses.waittype of 0x0800. While waiting on the network, no SQL
Server code is being run that can interrogate the PSS and detect a KILL
command. If the spid holds locks prior to waiting on the network I/O, it
may block other processes.
If the network connection times out or is manually canceled, the SQL thread
waiting on the network I/O will take an error return, thus being freed up
to scan its PSS and respond to a KILL. You can manually close a named pipes
connection with the NET SESSION or NET FILES command, or equivalent Server
Manager command. Other IPC sessions such as TCP/IP and SPX/IPX cannot be
manually closed, and the only option in this case is adjusting the session
timeout for the particular IPC to a shorter value. For more information,
see the following article in the Microsoft Knowledge Base:
Q137983
: How to Troubleshoot Orphaned Connections in SQL Server
The Spid Is Rolling Back (Also Called Being "in Backout")
If the transaction aborts for any reason, it must roll back. If it is a
long-running transaction, it may take as long to roll back as to apply the
transaction. This includes long-running implicit transactions such as
single SELECT INTO, DELETE, or UPDATE statements. While it is rolling back
it cannot be killed; otherwise, the transactional changes would not be
backed out consistently.
The unkillable rollback scenario can often be identified by observing the
sp_who output, which may indicate the ROLLBACK command. On SQL Server
version 6.5 Service Pack 2 or later, a ROLLBACK status has been added to
sysprocesses.status, which will also appear in sp_who output or the SQL
Enterprise Manager "current activity" screen. However, the most reliable
way to get this information is to inspect the DBCC PSS of the blocking SPID
in question, and observing the pstat value. For example:
dbcc traceon(3604) /* Return subsequent DBCC output to the client rather
than to the errorlog. */
go
SELECT SUID FROM SYSPROCESSES WHERE SPID=<unkillable SPID number>
go
DBCC PSS (suid, spid, 0) /* Where suid is from above, and spid is the
unkillable SPID number. */
go
The first line of returned information will contain the pstat value.
For example, it may be something like the following:
pstat=0x4000, 0x800, 0x100, 0x1
Meaning of pstat bits:
0x4000 -- Delay KILL and ATTENTION signals if inside a critical section
0x2000 -- Process is being killed
0x800 -- Process is in backout, thus cannot be chosen as deadlock victim
0x400 -- Process has received an ATTENTION signal, and has responded by
raising an internal exception
0x100 -- Process in the middle of a single statement transaction
0x80 -- Process is involved in multi-database transaction
0x8 -- Process is currently executing a trigger
0x2 -- Process has received KILL command
0x1 -- Process has received an ATTENTION signal
The pstat value above would be a typical situation if a long-running data
modification was canceled (for example, by clicking the Cancel Query button
on a GUI application), and then the SPID was found (for a time) to block
users, yet be unkillable. This situation is normal; the transaction must be
backed out. It can be identified by the bits, as noted above.
Spid 1 Has a Status of 0000 (Running Recovery)
When starting (or restarting) SQL Server, each database must complete
startup recovery before it can be used. This is seen as the first spid in
sp_who having a status of 0000. It cannot be killed, and recovery should be
allowed to run to completion without restarting the server. Only user spids
can be killed, not system spids such as lazywriter, checkpoint, RA Manager,
and so on. You also cannot kill your own spid. You can find which is your
spid by doing SELECT @@SPID.
Server Has Intentionally Delayed Honoring KILL
In a few situations, the server intentionally defers acting on a KILL
command or ATTENTION signal (a query cancellation request). An example of
this is while in a critical section. These intervals are usually brief.
This situation can be seen as a pstat value of 0x4000.
Code Path Does Not Check for KILL
If you have eliminated each of the above scenarios, it is possible that the
current code path simply is not checking for KILL. For example, before SQL
Server version 6.5 Service Pack 3, DBCC CHECKDB did not reliably respond to
KILL because certain code paths did not check it. If all of the above
situations have been excluded (that is, the user process is not waiting on
I/O and not in rollback, the database is not in recovery, and SQL Server is
not intentionally deferring KILL) yet KILL is not being honored, it may be
possible to enhance the server so that KILL works. To make this
determination, each case must be individually examined by your primary
support provider.
Other Information
The fact that the message "Process id 10 killed by Hostname JOE" is written
to the errorlog does not confirm the KILL has actually taken place. This
message is written immediately following making the kill request, but does
not signify that the KILL has been honored.
Additional query words:
terminate terminated T-SQL
Keywords : kbusage SSrvGen SSrvTran_SQL
Version : 6.0 6.5
Platform : winnt
Issue type : kbinfo
Last Reviewed: April 15, 1999