XADM: Exchange DB and Caching Hard Disks and Controllers
ID: Q188589
This article explains in detail how the use of write-caching hard disks and
hard disk controllers can affect the transactional integrity of the
Microsoft Exchange Server database engine.
Use of a write-caching disk controller can seriously jeopardize the
normally reliable data integrity of the Exchange database. Significant data
corruption can result from a system failure when a write-caching controller
without an extremely reliable battery backup is used. This type of controller
can compromise the normally reliable Exchange database recovery mechanism.
Recent advances in hardware design coupled with a need for high disk
performance on server platforms make it increasingly likely that an
Exchange Server hardware platform uses a write-caching disk controller. It
is advisable to determine whether a given Exchange Server computer has a
write-caching controller, or whether the disk drives themselves contain a
write cache. You should check with the hardware vendor for this
information. Explain to the vendor that your system is to be used as a
messaging server, and the write-ahead log mechanism of the message store's
database generally requires that writes not be cached. If you plan to turn
on controller caching for performance reasons, you must ensure that
cached writes will not be lost in the case of a system failure. The
controller must provide battery backup and other fault tolerance measures.
Generally, to meet these criteria, the hardware write caching mechanism on
the server must be designed with a messaging/database server in mind. It is
technically possible for a hardware write cache to be safe for Exchange
Server, but only if certain criteria are met by the hardware write cache
design. Essentially, all possible conditions that could result in the
discarding of dirty or updated pages in the write cache must be considered
and protected against.
Disk drive write caching is always considered dangerous and is not
recommended.
The Exchange database engine's data modification statements generate logical
page writes. This stream of writes can be pictured as going two places: the
log and the database itself. For performance reasons, the Exchange database
defers writes to the database through its own cache buffer system. Writes
to the log are only momentarily deferred until COMMIT time. They are not
cached in the same manner as writes to the database. Because log writes for
a given page always precede the page's writes to the database, the log is
sometimes referred to as a "write-ahead" log.
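
The ordering described above can be pictured with a small sketch. This is
only an illustration in Python, not the Exchange database engine's actual
implementation; the class TinyStore and its method names are hypothetical.
It shows log records being written at commit time while database page
writes are deferred in a cache until a later flush.

```python
class TinyStore:
    """Toy store with a write-ahead log (illustrative, hypothetical names)."""

    def __init__(self):
        self.log = []        # stands in for the transaction log on disk
        self.cache = {}      # in-memory cache of modified (dirty) pages
        self.database = {}   # stands in for the database file on disk

    def commit(self, changes):
        # Log records are written first, at commit time.
        for page, value in changes.items():
            self.log.append((page, value))
        # Database pages are only updated in the cache for now.
        self.cache.update(changes)

    def flush(self):
        # Deferred write of dirty pages to the database.
        self.database.update(self.cache)
        self.cache.clear()


store = TinyStore()
store.commit({"page1": "hello"})
log_after_commit = list(store.log)
database_after_commit = dict(store.database)
store.flush()
```

After commit, the log already holds the change while the database does not;
the database catches up only when flush runs.
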
Transactional integrity is one of the fundamental concepts of a relational
database system. Transactions are considered to be atomic units of work
that are either totally applied or totally rolled back. The Exchange database
write-ahead transaction log is a vital component in implementing
transactional integrity.
Any relational database system must also deal with a concept closely
related to transactional integrity, which is recovery from unplanned system
failure. A variety of non-ideal, real-world effects may cause this failure.
On many database management systems, system failure may result in a lengthy
human-directed manual recovery process.
In contrast, the Exchange database recovery mechanism is completely automatic
and operates without human intervention. For example, Exchange Server could
be supporting a mission-critical production application, and experience a
system failure due to a momentary power fluctuation. Upon restoration of
power, the server hardware would restart, networking software would load
and initialize, and Exchange Server would restart. As Exchange Server's
Exchange database initializes, it will automatically run its recovery
process based on data in the transaction log. This entire process occurs
without human intervention. Whenever the client workstations are restarted,
users will find all of their data present, up to the last transaction they
entered.
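
As a rough illustration of log-based recovery, the following Python sketch
replays committed log records over whatever reached the database before the
failure. It is a simplified, redo-only model under stated assumptions: the
function name recover and the (page, value) record layout are hypothetical
and do not reflect the actual Exchange log format or recovery code.

```python
def recover(log, database):
    """Replay committed log records, in order, into a copy of the database."""
    recovered = dict(database)
    for page, value in log:
        # Later records for the same page overwrite earlier ones.
        recovered[page] = value
    return recovered


# Assumed state after a failure: three committed log records, but only the
# first page write had reached the database file.
log = [("page1", "a"), ("page2", "b"), ("page1", "c")]
database_on_disk = {"page1": "a"}

result = recover(log, database_on_disk)
```

The sketch depends entirely on the log being intact, which is why lost log
writes are so damaging.
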
Exchange database transactional integrity and automatic recovery constitute a
very powerful time-and-labor saving capability. Unfortunately, use of a
write-caching disk controller can compromise the ability of the Exchange
database to recover. Such a controller intercepts Exchange database transaction
log writes, buffering them in a hardware cache on the controller board.
This improves performance significantly, but if system failure occurs for
any reason, the volatile data in the hardware cache may be lost,
jeopardizing data integrity.
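
The hazard can be sketched with a toy model of a controller that
acknowledges a write as soon as it is in volatile cache. This is a
hypothetical Python illustration, not a model of any particular controller;
the class CachingController and its methods are assumptions made for the
example.

```python
class CachingController:
    """Toy write-caching controller without battery backup (hypothetical)."""

    def __init__(self):
        self.cache = []   # volatile memory on the controller board
        self.disk = []    # durable storage

    def write(self, record):
        # The write is acknowledged before it reaches the disk.
        self.cache.append(record)
        return True

    def destage(self):
        # Cached writes are eventually copied to disk.
        self.disk.extend(self.cache)
        self.cache.clear()

    def fail(self):
        # An uncontrolled reset or power loss discards the volatile cache.
        self.cache.clear()


ctrl = CachingController()
ctrl.write("txn1")
ctrl.destage()
acknowledged = ctrl.write("txn2")   # the database believes txn2 is logged
ctrl.fail()
```

In this model the write of txn2 was acknowledged, yet only txn1 is on disk
after the failure, so recovery would have no record of txn2.
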
Most caching controllers perform write caching. The write caching function
cannot always be disabled.
Even if the server uses an uninterruptible power supply (UPS), this does
not guarantee the security of the cached writes. Many types of system
failures can occur that a UPS does not address. For example, a memory
parity error, an operating system trap, or a hardware glitch that causes a
system reset can produce an uncontrolled system interruption. A memory
failure in the hardware write cache can also result in the loss of vital
log information.
Another possible problem related to a write-caching controller may occur at
system shutdown. If the operating system is taking a long time to shut
down gracefully, it is not uncommon for an operator to become impatient and
"cycle" the machine manually. When the power to the machine is turned off
or the RESET button is pressed before the operating system has shut down
completely, cached writes can be discarded, potentially damaging the database.
It is possible to design a hardware write cache that takes into account
all possible causes of discarding dirty cache data, which would thus be
safe for use by a database server. Some of these design features would
include intercepting the RST bus signal to avoid uncontrolled reset of the
caching controller, on-board battery backup, and mirrored or ECC (error
checking and correcting) memory. Check with your hardware vendor to ensure
that the write cache includes these and any other features necessary to
avoid data loss.
Keywords : XADM
Version : WINDOWS:4.0,5.0,5.5
Platform : WINDOWS
Issue type : kbinfo
Last Reviewed: April 20, 1999