INF: Optimizing Microsoft SQL Server Performance

ID: Q110352


The information in this article applies to:


SUMMARY

To most effectively optimize Microsoft SQL Server performance, you must identify the areas that will yield the largest performance increases over the widest variety of situations, and focus analysis on these areas. Otherwise, you may expend significant time and effort on topics that may not yield sizable improvements.

For the most part, the following information does not address the performance issues stemming from multiuser concurrency. This is a separate, complex topic that is covered in the document "Maximizing Database Consistency and Concurrency," which can be found in the SQL Server version 4.2x "Programmer's Reference for C," Appendix E, and also in other Knowledge Base articles. It is not in the version 6.0 documentation, but can be found on the MSDN (Microsoft Developer Network) CD under that title.

Rather than a theoretical discussion, this article focuses primarily on areas that years of experience by the Microsoft SQL Server Support team has shown to be of practical value in real world situations.

Experience shows that the greatest benefit in SQL Server performance can be gained from the general areas of logical database design, index design, query design, and application design. Conversely, the biggest performance problems are often caused by deficiencies in these same areas. If you are concerned with performance, you should concentrate on these areas first, because very large performance improvements can often be achieved with a relatively small time investment.

While other system-level performance issues, such as memory, cache buffers, hardware, and so forth, are certainly candidates for study, experience shows that the performance gain from these areas is often incremental. SQL Server manages available hardware resources automatically, for the most part, reducing the need (and therefore, the benefit) of extensive system-level hand tuning.

Microsoft SQL Server 6.0 provides new opportunities for platform-layer performance improvements, with large amounts of memory, symmetrical multiprocessing, parallel data scan, optimizer enhancements, and disk striping. However, as large as these improvements are, they are finite in scope. The fastest computer can be bogged down with inefficient queries or a poorly designed application. Thus, even with the additional performance increase that SQL Server 6.0 allows, it is extremely important to optimize the database, index, query, and application design.

Most performance problems cannot be successfully resolved with only a server-side focus. The server is essentially a "puppet" of the client, which controls what queries are sent, and thereby what locks are obtained and released. Although some tuning is possible on the server side, successful resolution of performance problems will usually depend on acknowledging the dominant role the client plays in the problem and analyzing client application behavior.


MORE INFORMATION

The following are some suggestions that, based on experience, have yielded significant performance gains:

Normalize Logical Database Design

Reasonable normalization of the logical database design yields best performance. A greater number of narrow tables is characteristic of a normalized database. A lesser number of wide tables is characteristic of a denormalized database. A highly normalized database is routinely associated with complex relational joins, which can hurt performance. However, the SQL Server optimizer is very efficient at selecting rapid, efficient joins, as long as effective indexes are available.

The benefits of normalization include:
With SQL Server, reasonable normalization often helps rather than hurts performance. As normalization increases, so do the number and complexity of joins required to retrieve data. As a rough rule of thumb, Microsoft suggests carrying on the normalization process unless this causes many queries to have four-way or greater joins.

If the logical database design is already fixed and total redesign is not feasible, it may be possible to selectively normalize a large table if analysis shows a bottleneck on this table. If access to the database is conducted through stored procedures, this schema change could take place without impacting applications. If not, it may be possible to hide the change by creating a view that looks like a single table.

Use Efficient Index Design

Unlike many non-relational systems, relational indexes are not considered part of the logical database design. Indexes can be dropped, added, and changed without affecting the database schema or application design in any way other than performance. Efficient index design is paramount in achieving good SQL Server performance. For these reasons, you should not hesitate to experiment with different indexes.

The optimizer reliably chooses the most effective index in the majority of cases. The overall index design strategy should be to provide a good selection of indexes to the optimizer, and trust it to make the right decision. This reduces analysis time and gives good performance over a wide variety of situations.

The following are index design recommendations:
Do not always equate index usage with good performance, and the reverse. If using an index always produced the best performance, the optimizer's job would be very simple - always use any available index. In reality, incorrect choice of indexed retrieval can result in very bad performance. Therefore the optimizer's task is to select indexed retrieval where it will help performance, and avoid indexed retrieval where it will hurt performance.

Use Efficient Query Design

Some types of queries are inherently resource intensive. This is related to fundamental database and index issues common to most relational database management systems (RDBMSs), not specifically to SQL Server. They are not inefficient, because the optimizer will implement the queries in the most efficient fashion possible. However, they are resource intensive, and the set-oriented nature of SQL may make them appear inefficient. No degree of optimizer intelligence can eliminate the inherent resource cost of these constructs. They are intrinsically costly when compared to a more simple query. Although SQL Server will use the most optimal access plan, this is limited by what is fundamentally possible.

For example:
Various factors may necessitate the use of some of these query constructs. The impact of these will be lessened if the optimizer can restrict the result set before applying the resource intensive portion of the query. The following are some examples.

Resource-intensive:

   SELECT SUM(SALARY) FROM TABLE 

Less resource-intensive:

   SELECT SUM(SALARY) FROM TABLE WHERE
   ZIP='98052' 

Resource-intensive:

   SELECT * FROM TABLE WHERE
   LNAME=@VAR 

Less resource-intensive:

   SELECT * FROM TABLE
   WHERE LNAME=@VAR AND ZIP='98052' 

In the first example, the SUM operation cannot be accelerated with an index. Each row must be read and summed. Assuming that there is an index on the ZIP column, the optimizer will likely use this to initially restrict the result set before applying the SUM. This can be much faster.

In the second example, the local variable is not resolved until run time. However, the optimizer cannot defer the choice of access plan until run time; it must choose at compile time. Yet at compile time, when the access plan is built, the value of @VAR is not known and consequently cannot be used as input to index selection.

The illustrated technique for improvement involves restricting the result set with an AND clause. As an alternate technique, use a stored procedure, and pass the value for @VAR as a parameter to the stored procedure.

In some cases it is best to use a group of simple queries using temp tables to store intermediate results than to use a single very complex query.

Large result sets are costly on most RDBMSs. You should try not to return a large result set to the client for final data selection by browsing. It is much more efficient to restrict the size of the result set, allowing the database system to perform the function for which it was intended. This also reduces network I/O, and makes the application more amenable to deployment across slow remote communication links. It also improves concurrency-related performance as the application scales upward to more users.

Use Efficient Application Design

The role that application design plays in SQL Server performance cannot be overstated. Rather than picture the server in the dominant role, it is more accurate to picture the client as a controlling entity, and the server as a puppet of the client. SQL Server is totally under the command of the client regarding the type of queries, when they are submitted, and how results are processed. This in turn has a major effect on the type and duration of locks, amount of I/O and CPU load on the server, and hence whether performance is good or bad.

For this reason, it is important to make the correct decisions during the application design phase. However even if you face a performance problem using a turnkey application where changes to the client application seem impossible, this does not change the fundamental factors which affect performance - namely that the client plays a dominant role and many performance problems cannot be resolved without making client changes.

With a well-designed application, SQL Server is capable of supporting thousands of concurrent users. With a poorly-designed application, even the most powerful server platform can bog down with just a few users.

Using the following suggestions for client application design will provide good SQL Server performance:

Techniques to Analyze Slow Performance

It may be tempting to address a performance problem solely by system-level server performance tuning. For example, how much memory, the type of file system, the number and type of processors, and so forth. The experience of Microsoft SQL Server Support has shown that most performance problems cannot be resolved this way. They must be addressed by analyzing the application, the queries the application is submitting to the database, and how these queries interact with the database schema.

First, isolate the query or queries that are slow. Often it appears that an entire application is slow, when only a few of the SQL queries are slow. It is usually not possible to resolve a performance problem without breaking the problem down and isolating the slow queries. If you have a development tool that transparently generates SQL, use any available diagnostic or debug mode of this tool to capture the generated SQL. In many cases trace features are available, but they may not be openly documented. Contact the technical support for your application to determine if a trace feature exists for monitoring the SQL statements generated by the application.

For application development tools that use embedded SQL, this is much easier - the SQL is openly visible.

If your development tool or end-user application does not provide a trace feature, there are several alternatives:
After the slow query is isolated, do the following:
Check the program for I/O or CPU-bound behavior. It is often useful to determine if a query is I/O or CPU bound. This helps focus your improvement efforts on the true bottleneck. For example, if a query is CPU bound, adding more memory to SQL Server will probably not improve performance, because more memory only improves the cache hit ratio, which in this case, is already high.

How to Examine I/O vs. CPU-bound Query Behavior:

Conclusion

SQL Server is capable of very high performance on large databases. This is especially the case with SQL Server 6.0. To achieve this performance potential, you must use efficient database, index, query, and application design. These areas are the best candidates for obtaining significant performance improvement. Try to make each query as efficient as possible, so that when your application scales up to more users, the collective multiuser load is supportable. Study of the client application behavior, the queries submitted by the application, and experimentation with indexes using the guidelines in this document are strongly encouraged. A methodical approach in analyzing performance problems will often yield significant improvement for relatively little time investment.

Additional query words: 4.20 sql6 Windows NT optimization sqlfaqtop


Keywords          : kbother SSrvAdmin SQLFAQ SSrvWinNT 
Version           : winnt:4.2x,6.0,6.5
Platform          : winnt 
Issue type        : kbinfo 

Last Reviewed: June 8, 1999