INF: SQL Server and Caching Disk ControllersID: Q86903
|
Use of a write caching (also called write back caching) disk controller can
improve SQL Server performance. Write caching controllers and disk
subsystems are safe for SQL Server, if they are specifically designed for
use in a data critical transactional database management system (DBMS)
environment. These design features must preserve cached data if a system
failure occurs. Using an external uninterruptible power supply (UPS) to
achieve this is generally not sufficient, because failure modes that are
unrelated to power can occur.
Caching controllers and disk subsystems can be safe for use by SQL Server.
Most new purpose-built server platforms that incorporate these are safe.
However, you should check with your hardware vendor to be sure that the
disk subsystem has been specifically tested and approved for use in a data
critical transactional relational database management system (RDBMS)
environment.
SQL Server data modification statements generate logical page writes. This
stream of writes can be pictured as going two places: the log and the
database itself. For performance reasons, SQL Server defers writes to the
database via its own cache buffer system. Writes to the log are only
momentarily deferred until COMMIT time. They are not cached in the same
manner as data writes. Because log writes for a given page always precede
the page's data writes, the log is sometimes referred to as a "write-ahead"
log.
Transactional integrity is one of the fundamental concepts of a relational
database system. Transactions are considered to be atomic units of work
that are either totally applied or totally rolled back. The SQL Server
write-ahead transaction log is a vital component in implementing
transactional integrity.
Any relational database system must also deal with a concept closely
related to transactional integrity, which is recovery from unplanned system
failure. A variety of non-ideal, real-world effects may cause this failure.
On many database management systems, system failure may result in a lengthy
human-directed manual recovery process.
In contrast, the SQL Server recovery mechanism is completely automatic and
operates without human intervention. For example, SQL Server could be
supporting a mission-critical production application, and experience a
system failure due to a momentary power fluctuation. Upon restoration of
power, the server hardware would restart, networking software would load
and initialize, and SQL Server would restart. As SQL Server initializes, it
will automatically run its recovery process based on data in the
transaction log. This entire process occurs without human intervention.
Whenever the client workstations restarted, users would find all of their
data present, up to the last transaction they entered.
SQL Server transactional integrity and automatic recovery constitute a very
powerful time-and-labor saving capability. If a write caching controller is
not properly designed for use in a data critical transactional DBMS
environment, it may compromise the ability of SQL Server to recover, hence
corrupting the database. This can occur if the controller intercepts SQL
Server transaction log writes and buffers them in a hardware cache on the
controller board, but does not preserve these written pages during a system
failure.
Most caching controllers perform write caching. The write caching
function cannot always be disabled.
Even if the server uses a UPS, this does not guarantee the security of the
cached writes. Many types of system failures can occur that a UPS does not
address. For example, a memory parity error, an operating system trap, or a
hardware glitch that causes a system reset can produce an uncontrolled
system interruption. A memory failure in the hardware write cache can also
result in the loss of vital log information.
Another possible problem related to a write-caching controller may occur at
system shutdown. It is not uncommon to "cycle" the operating system or
reboot the system during configuration changes. Even if a careful operator
follows the operating system recommendation to wait until all disk activity
has ceased before rebooting the system, cached writes can still be present
in the controller. When the CTRL+ALT+DEL key combination is pressed, or the
RESET button is pressed, cached writes can be discarded, potentially
damaging the database.
It is possible to design a hardware write cache which takes into account
all possible causes of discarding dirty cache data, which would thus be
safe for use by a database server. Some of these design features would
include intercepting the RST bus signal to avoid uncontrolled reset of the
caching controller, on-board battery backup, and mirrored or ERC (error
checking & correcting) memory. Check with your hardware vendor to ensure that the write cache includes these and any other features necessary to avoid data loss.
Additional query words: prodsms
Keywords : kb3rdparty kbhw SSrvServer
Version : winnt:4.2x,6.0,6.5,7.0
Platform : winnt
Issue type : kbinfo
Last Reviewed: March 20, 1999