INF: Recovering from Media Failure in SQL Server

ID: Q46479


The information in this article applies to:


SUMMARY

This article provides guidance on how to recover as much data as possible when the disk media is physically damaged or becomes unreadable. These procedures are sometimes called "disaster recovery" or "media failure" procedures.


MORE INFORMATION

To recover user databases from media failure, re-create the lost device(s), then restore the affected database(s) from backups. Anything done after the backups were made must be done again.

To restore a backup, the target database must already exist. The target database does not have to occupy the same device(s), it only needs to have enough space allocated to hold what was in that database at the time it was dumped.

The steps necessary to restore a backup are as follows:

  1. Re-create the lost device(s) as follows:

    1. Bring up SQL Server in single-user mode.


    2. Drop each database that has space allocated on a lost device using the DBCC DBREPAIR (dbname,DROPDB) command.

      The following SQL command will list the affected databases:
      
               SELECT DISTINCT DB_NAME(DBID) FROM SYSUSAGES,SYSDEVICES WHERE
                 VSTART>=LOW AND VSTART<=HIGH AND
                 NAME IN ('devname1','devname2',...)
           


    3. Drop lost device(s) with the sp_dropdevice() function.


    4. Use the CHECKPOINT command and shut down SQL Server, then bring up SQL Server in normal mode.


    5. Re-create lost device(s) with the DISK INIT command using the VDEVNO and SIZE of the original device(s).




  2. Re-create and reload the affected database(s) as follows:

    1. Re-create the affected database(s) using the same size and device assignment as the original database(s).


    2. Assign LOG device(s) if necessary with the sp_logdevice() function.


    3. LOAD the affected database(s) and transaction log(s) from the latest backups.




Notes

SQL Server version 4.2 allows for the dumping of the transaction log if the device containing the data portion of a database has been damaged and the device containing the log portion is still intact. The transaction log can be dumped using "DUMP TRAN WITH NO_TRUNCATE" and can then be reapplied to a prior database dump (after all previous transaction log dumps have been applied) to bring the database back to the state it was in immediately prior to the media failure.

DBCC will produce error messages about not being able to access the lost database devices; however, it will still drop the rows from SYSDATABASES and SYSUSAGES.

If the information required in steps 1e and 2a is not known, it can be derived from the SYSDEVICES, SYSDATABASES, and SYSUSAGES tables in the master database.

VDEVNO and device SIZE can be derived from LOW and HIGH in SYSDEVICES as follows:

   VDEVNO is LOW divided by 16777216 (0x01000000)
     SIZE is HIGH - LOW (2K blocks) 

SIZE and device assignment of databases can be derived from SYSDEVICES, SYSUSAGES, and SYSDATABASES.

Each chunk of disk space assigned to a database is represented by a row in SYSUSAGES. The SIZE of each chunk is in the row. The dbid of the database to which the chunk is assigned is also in the row and the corresponding database name can be obtained from SYSDATABASES. The device containing the chunk of disk space can be deduced by noticing which low/high range in SYSDEVICES contains the VSTART specified in SYSUSAGES.

Whether or not a device is a log device can be determined from the value of SEGMAP in SYSUSAGES. A 7 indicates LOG and data, 4 indicates LOG only, and 3 indicates data only.

Example 1

Assume two user databases in addition to the master database and two database devices in addition to the master device. Device 1 is 4 MB and was created first (VDEVNO 1), while device 2 is 6 MB and was created after device 1 (VDEVNO 2). User database 1 was allocated with 2 MB on device 1, 2 MB on MASTER.DAT, and a 1 MB log on device 2. User database 2 was allocated with 2 MB on device 2 and 2 MB on device 1.

For the first disaster scenario, assume that user database device 1 has been lost and the other devices are intact. According to the procedure described above, the following should be performed:
  1. Re-create the lost device(s) as follows:

    1. Bring up SQL Server in single-user mode (-m option).


    2. Determine which databases are affected as follows:
      
               SELECT DISTINCT DB_NAME(DBID) FROM SYSUSAGES,SYSDEVICES WHERE
                 VSTART>=LOW AND VSTART<=HIGH AND
                 NAME IN ('device1')
         
      Drop the affected databases:
      
               DBCC DBREPAIR(database1,DROPDB)
               DBCC DBREPAIR(database2,DROPDB)
         


    3. Drop device 1 with the sp_dropdevice() function.


    4. CHECKPOINT and shut down the server, then bring it back up normally.


    5. Re-create device 1 with DISK INIT with a SIZE = 4 MB and VDEVNO = 1.




  2. Re-create and reload the affected database(s) as follows:

    1. Re-create user database 1 with 2 MB on device 1, 2 MB on MASTER.DAT, and 1 MB on device 2.

      Re-create user database 2 with 2 MB on device 2 and 2 MB on device 1.


    2. Use the sp_logdevice() function to designate the log for database 1.


    3. Load database 1 and its transaction log from backup. Load database 2 from backup.




Example 2

For the second disaster scenario, assume that both user database devices have been lost and MASTER.DAT is intact. The same procedure applies:
  1. Re-create the lost device(s) as follows:

    1. Bring up SQL Server in single-user mode (-m option).


    2. Determine which databases are affected:
      
               SELECT DISTINCT DB_NAME(DBID) FROM SYSUSAGES,SYSDEVICES WHERE
                 VSTART>=LOW AND VSTART<=HIGH AND
                 NAME IN ('device1','device2')
         
      Drop the affected databases:
      
               DBCC DBREPAIR(database1,DROPDB)
               DBCC DBREPAIR(database2,DROPDB)
         


    3. Drop devices 1 and 2 with the sp_dropdevice() function.


    4. CHECKPOINT and shut down the server, then bring it back up normally.


    5. Re-create device 1 with DISK INIT with SIZE = 4 MB and VDEVNO = 1.

      Re-create device 2 with DISK INIT with SIZE = 6 MB and VDEVNO = 2.




  2. Re-create and reload the affected database(s) as follows:

    1. Re-create user database 1 with 2 MB on device 1, 2 MB on MASTER.DAT, and 1 MB on device 2.

      Re-create user database 2 with 2 MB on device 2 and 2 MB on device 1.


    2. Use the sp_logdevice() function to designate the log for database 1.


    3. Load database 1 and its transaction log from backup.

      Load database 2 from backup.




Example 3

For the third disaster scenario, assume that all three database devices have been lost. In this case, MASTER.DAT must be restored first.

Run BLDMASTR to re-create MASTER.DAT using the original size and case-sensitivity option.

Bring up SQL Server in single-user mode and restore the latest backup of the master database. If the dump of the master was made with a user-supplied dump device, that dump device must be re-added to SYSDEVICES before the restore can take place.

The quickest way to do this is to INSERT a row into SYSDEVICES. It can also be done with the sp_addumpdevice() function; however, that approach requires the INSTMSTR.SQL script to be run first to re-add the stored procedures.

The server will shut itself down when the restore is complete.

The remainder of the procedure is identical to the second example.

Example 4

For the fourth disaster scenario, assume that only MASTER.DAT has been lost.
  1. Run BLDMASTR to re-create MASTER.DAT using the original size and case-sensitivity option.


  2. Bring up SQL Server in single-user mode and restore the latest backup of the master database. The same considerations apply as in the third disaster scenario. The server will shut itself down when the restore is complete.


If no changes were made to the master database after the most recent dump was taken, then recovery is complete. Restart the server normally. This is why it is a good idea to dump the master after such changes.

If not, then SYSDATABASES, SYSDEVICES, and SYSUSAGES must be brought up to date. If this must be done, bring the server up in single-user mode again.

Use the DISK REINIT command to re-create rows in SYSDEVICES for all database devices added after the most recent dump was taken. The DISK REINIT command is used because it updates SYSDEVICES just as the DISK INIT command does; however, it does not format the physical disk file, therefore the existing data is preserved.

Use the DISK REFIT command to re-create rows in SYSUSAGES and SYSDATABASES for all CREATE and ALTER DATABASE commands that were performed after the most recent dump was taken. The DISK REFIT command scans the physical file associated with each database device listed in SYSDEVICES. It adds entries in SYSUSAGES to define the space that is allocated to databases. It also adds the corresponding SYSDATABASES entries. Some of the information is not reconstructed perfectly. For example, the original VDEVNO is not assigned because it is not known. Instead, VDEVNOs are assigned sequentially. The database owner is not extracted while scanning the physical files; it is just set to "sa". It is not possible to determine how many SYSUSAGES entries originally existed. The DISK REFIT command inserts a separate entry for each different segment type.

When this is done, correct the entries made by DISK REFIT to SYSDATABASES and SYSUSAGES (if desired) and also add to SYSLOGINS any login-ids that were not captured by the most recent dump. Then, shut down the server and bring it back up normally.

Additional query words: 4.20


Keywords          : SSrvAdmin SSrvServer 
Version           : 4.20
Platform          : OS/2 
Issue type        : 

Last Reviewed: March 6, 1999