INF: Optimization Techniques for OS/2 SQL Server

ID: Q69329

The information in this article applies to:

Microsoft SQL Server for OS/2, version 4.2

SUMMARY

This article describes optimization techniques and configuration options for SQL Server version 4.2 for OS/2.

MORE INFORMATION

System Environment Optimization

The following are suggestions for changes to the system environment to optimize SQL Server:

Turn off the PM spooler for OS/2 releases before 2.0. Use the Control Panel to do this.

Turn off the MS-DOS session by editing the CONFIG.SYS file as follows:
1. Set PROTECT ONLY = Yes.
2. REMark out the EGA.SYS and ANSI.SYS entries (if present).
This will save 640K of memory and prevent possible problems when executing real-mode applications.

Turn off the LAN Manager disk cache.

If you are running a dedicated SQL Server, performance can be improved by turning off the LAN Manager disk cache capabilities, thus giving more memory to SQL Server. SQL Sever handles its own disk cache.

To turn off the disk cache, edit the CONFIG.SYS file as follows:

REM diskcache

Move the swap file (SWAPPER.DAT) to its own partition, which has a large amount of disk space and little activity.

By default, the environment variable SWAPPATH = path size is located in the OS/2 directory.

SQL Server Optimization

The following describes different methods within SQL Server to optimize performance:

Configuration options
1. Memory
  
  A rule of thumb to follow for allocating memory is:
```
      Server Type               Amount of RAM
      -----------               --------------

      Dedicated server          80 percent of RAM

      Nondedicated server       50-60 percent of RAM
    
```
  Memory is allocated to SQL Server in 2K units. (Note: SAF specifies memory in kilobytes.) The following is a formula that illustrates how much memory should be allocated for a dedicated SQL Server with 8 MB of RAM:
```
         (8192K * .8)/2 = 3276 2K pages, which leaves about 1.5 MB to
         be used by OS/2 and LAN Manager
    
```
  Note: Memory allocated to SQL Server should never exceed 90 percent of available system memory because this will cause swapping and will defeat "in memory" access.
2. User connections
  
  The number of user connections must be high enough to cover all connections to SQL Server. Each user connection requires about 42K. Static memory is allocated for user connections at run time, and the remaining memory is divided between the procedure and the data cache.
  
  Requirements
  ------------
  1. One connection per dbopen() in a DB-Library program. Most applications require between one and three connections: one connection per utility program run simultaneously (ISQL, BCP, etc.).
  2. One connection for the Console program, plus a new connection for reading/writing when the program is started.
  The following formula can be used when calculating memory requirements for user connections:
```
             1 Connection for Console Program
        +    # Max Number of Simultaneous Logins
             # Max Number of Connections
     
```
  The following illustrates how to calculate the memory required for 40 simultaneous logins:
```
 
             (((Number of Simultaneous Logins * Average Number of
                Connections Per Application)
        +    Connection Required for Console Utility)
        *    Memory Required Per Connection)
             Total Amount of Memory Required
             for 40 Simultaneous Logins
     
```
  -or-
```
             (((40 * 4) + 1) * 42K) = 6762K
     
```
  NOTE: The average number of user connections required per application is calculated by adding up all dbopen()s in the applications and dividing the sum by the total number of applications. For example:
```
          3    dbopen()s for Application 1
          3    dbopen()s for Application 2
       +  6    dbopen()s for Application 3
          12   Total of All dbopen()s
       /  3    Total Number of Applications
          4    Average Number of Connections Required Per Application
     
```
3. Procedure cache
  
  The memory allocated to the procedure cache is what is left over after static memory (user connections) requirements have been allocated by SQL Server at run time. The leftover memory is actually split between the procedure cache and the data cache. However, the percentage allocated to procedure the cache is configurable. The default value is 20 percent and can be changed using the sp_configure system command.
  
  The procedure cache holds internal structures for rules, stored procedures, and triggers; that is, anything that is compiled. The larger the procedure cache is, the less chance you will have to recompile any triggers, rules, and so on.
  
  The procedure cache typically must be higher during development because of the high level of compilation that occurs.
4. Data cache
  
  The size of the data cache affects checkpoint time. The larger the data cache, the more time needed to align completed transactions to disk.
5. Recovery interval
  
  The recovery controls the frequency of a checkpoint. A checkpoint aligns all completed transactions to a disk and writes out the last page of the log. The guidelines to follow when setting this option are:
  - A low number results in faster retrieval and slower updates because of the increased number of checkpoints and the associated overhead of I/O.
  - A high number results in slower retrieval but faster updates because of the decrease in the number of checkpoints.
6. Time slice
  
  The SQL Server kernel is not preemptive. The kernel does not control time spent on a process; the process controls the time spent in the CPU and "schedules itself out." However, if the process does give itself up (that is, exceeds the time limit set in the time slice configuration), the kernel will terminate the process because it assumes that the program is in a hard loop.
  
  If the time slice is set too low, the system may slow down due to overhead caused by the system "thrashing"; that is, the programs are frequently scheduling themselves in and out of the CPU.
  
  If the time slice is set too high, it may cause long response time when one process fails to schedule itself out of the CPU for a long time.

Indexes
1. Clustered versus nonclustered indexes
  
  The key question is: what is more important, retrieval or update performance?
  
  With a clustered index, the tuples are sequenced in addition to the index data being sequenced. With a nonclustered index, only the data of the index is clustered, or sequenced.
  
  Thus, a clustered index provides faster retrieval. Less space is used because one level of the index tree is omitted. However, updates are slower because of the overhead associated with the sequencing of the tuples.
2. Number of indexes on a table
  
  Indexes must be updated when their tables are updated. If there are a lot of indexes for a table, updates, inserts, and deletes may be slow.
3. Fill factor
  
  A fill factor can be used to specify the percentage that an index data page should be filled to allow for future expansion.
```
      Percentage    Explanation
      ----------    -----------

      100%          Indexes are completely filled. This is optimal for
                    quick retrieval purposes but is not optimal for
                    updates because of page overflows (page splits).

      50-60%        This is preferred for tables that are updated
                    because of a lower occurrence of page splits
                    and/or shrinks, but it makes for slower retrieval
                    time.
    
```
4. UPDATE STATISTICS
  
  The UPDATE STATISTICS command should be run on a table after a significant number of updates have occurred to provide the query optimizer with current statistics from which to create optimal query plans.
  
  Statistics are automatically updated if data is present in the table at the time of the index creation. However, if data is not present in the table, the UPDATE STATISTICS command must be executed.

Queries
1. SET commands
  
  These commands can be used to monitor the execution of queries for optimization. For more information on what these commands are and how they can be used, see page 195 in the Microsoft SQL Server Language Reference version 1.1.
2. Stored procedures
  
  Stored procedures are faster than ad-hoc queries because their execution plans are stored in the procedure cache, thus requiring less I/O. However, stored procedures can be slower when a search value cannot be determined until run time; that is, a parameter is being passed to the stored procedure.
  
  Using control of flow language in a stored procedure cuts down on network traffic because it gives control to the CPU.


Keywords          : kbenv kbother SSrvServer 
Version           : 4.2
Platform          : OS/2 
Issue type        :

Last Reviewed: March 10, 1999