FIX: SQL Server Service Stopped When IsAlive Fails to Connect

ID: Q185806


The information in this article applies to:

BUG #: Windows NT: 17812 (6.50)

SYMPTOMS

A SQL Server computer that is enabled for clustering with Microsoft Cluster Server (MSCS) may experience sudden stops of the SQL Server service, followed by restarts of the service. The last error log ends with the following message:

SQL Server terminating due to 'stop' request from Service Control Manager


CAUSE

The Resource DLL for the SQL Server service exports two functions that the MSCS Cluster Manager uses to check the availability of the SQL Server resource at predefined intervals. There is a simple check, LooksAlive, that queries the service status through the Windows NT Service Control Manager, and a more stringent check, IsAlive, that connects to SQL Server as user "probe" and performs a simple query against the system catalog. By default, LooksAlive runs every 5 seconds and IsAlive runs every 60 seconds.

IsAlive uses a fixed login time-out of 15 seconds to connect to SQL Server. In situations where the server is very busy, SQL Server may fail to respond to the IsAlive login request within this interval. IsAlive then returns FALSE to the Cluster Manager, which issues a Terminate request and a subsequent Online request to SQL Server and the SQL Executive Resource DLL. As a result, both services are stopped and restarted.
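
The decision described above can be modeled with a minimal Python sketch. It is an illustration only, not the Resource DLL code: the function names are hypothetical, and treating a response that arrives at exactly 15 seconds as a success is an assumption.

```python
LOGIN_TIMEOUT_S = 15  # fixed IsAlive login time-out stated in this article


def is_alive(login_response_s, timeout_s=LOGIN_TIMEOUT_S):
    """Model of the IsAlive result: True only if the login completes in time.

    Assumption: a response at exactly timeout_s still counts as a success.
    """
    return login_response_s <= timeout_s


def cluster_actions(login_response_s):
    """Return the requests issued after one IsAlive poll in this model."""
    if is_alive(login_response_s):
        return []  # resource is considered healthy, nothing to do
    # IsAlive returned FALSE: the resource is terminated and brought online again
    return ["Terminate", "Online"]
```

In this model, a busy server that needs 40 seconds to answer the login produces the same result as a server that is actually down, which is the behavior reported in SYMPTOMS.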


WORKAROUND

Increase the polling interval for the SQL Server IsAlive test in MSCS Cluster Administrator to reduce the likelihood of this problem.
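
The effect of a longer polling interval can be estimated with simple arithmetic, sketched here in Python. The helper name and the 300-second value are hypothetical examples, not recommended settings.

```python
def is_alive_polls_per_hour(interval_s):
    """Number of IsAlive polls that start within one hour at a given interval."""
    return 3600 // interval_s


default_polls = is_alive_polls_per_hour(60)    # default interval from this article
example_polls = is_alive_polls_per_hour(300)   # hypothetical longer interval
```

A longer interval does not change the 15-second login time-out. It only reduces how often a login attempt can coincide with a period of heavy load.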


STATUS

Microsoft has confirmed this to be a problem in SQL Server version 6.5. This problem has been corrected in U.S. Service Pack 5a for Microsoft SQL Server version 6.5. For information about downloading and installing the latest SQL Server Service Pack, see http://support.microsoft.com/support/sql/.

For more information, contact your primary support provider.


MORE INFORMATION

If you experience sudden SQL Server restarts in a cluster environment and are unsure of the cause, enable MSCS logging by restarting the MSCS service with the system environment variable "Clusterlog=<path>". MSCS then logs all activity to the specified log file. Among other entries, the log records every call to LooksAlive and IsAlive and its outcome. If the login fails, the following message appears in the log:

[sql65res] CheckQueryProcessorAlive: dbopen failed
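
To locate this message in a large log file, a short Python sketch such as the following may help. It assumes only that the message text appears somewhere on a line, so any prefix that MSCS adds to the line is ignored. The function name is hypothetical.

```python
FAILURE_MARKER = "[sql65res] CheckQueryProcessorAlive: dbopen failed"


def find_failed_logins(lines):
    """Return (line_number, text) for each line that contains the marker.

    `lines` can be any iterable of strings, for example an open file.
    Substring matching is an assumption about the log layout.
    """
    return [
        (number, line.rstrip("\r\n"))
        for number, line in enumerate(lines, start=1)
        if FAILURE_MARKER in line
    ]
```

To use it, open the file named in the Clusterlog variable and pass the file object to find_failed_logins. Each returned line number points to an IsAlive login attempt that failed.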

Additional query words: sp sp5prodsql


Keywords          : kbbug6.50 kbfix6.50.SP5 
Version           : winnt:6.5
Platform          : winnt 
Issue type        : kbbug 

Last Reviewed: April 14, 1999