INF: OLAP: How to Index Star/Snowflake Schema Data

ID: Q199132


The information in this article applies to:


SUMMARY

When preparing data to be moved into OLAP cubes, the indexing of the dimensions and fact tables in the base data is critically important to the performance of cube processing. This article is intended to give users of SQL Server OLAP Services an overview of this issue and a simple and effective starting point.


MORE INFORMATION

In MOLAP, ROLAP, or HOLAP, OLAP Services connects to the central data warehouse (which can be on SQL Server) like any other client application (through OLE DB) and sends a GROUP BY query to it. SQL Server parses the query and either sends the data back to OLAP Services (with MOLAP and HOLAP) or simply aggregates the data into new tables (with ROLAP). OLAP Services is just a client.

When a client application connects to SQL Server and performance is not what is expected, it is common practice to look at SQL Server and check how to optimize the query/schema for better performance. The same is true with OLAP Services. If performance is slow in retrieving data from SQL Server, something is probably wrong with the star/snowflake schema in SQL Server (or other heterogeneous data source), including schema design, indexing, key relationships, referential integrity, and insufficient data scrubbing. You should treat and optimize your schema as you would for any other client application.

The solution given below is only one possibility. As with all data projects, individual schemas and uses will determine the best solution for that project. This solution is meant as a starting place. The "OLAP and Data Warehouses" topic in the product documentation provides additional information regarding schemas, design of dimension tables, normalization, fact summarization, referential integrity, and indexes.

Before processing an OLAP Services cube, do the following:

  1. Declare a primary key in each dimension table.


  2. For star dimensions, declare foreign key (FK) relationships between each dimension table and the corresponding fact table. Additionally, for snowflaked dimensions, declare FK relationships between each secondary dimension table and the primary dimension table that it augments.


  3. Make sure that there is a primary key associated with the fact table(s), even if you have to use an IDENTITY field.


  4. Declare indexes on each of the primary keys in the dimension tables and in the fact table(s).


  5. Declare indexes on each of the foreign keys in the fact table(s).
    NOTE: The indexing strategy suggested in the product documentation under "OLAP and Data Warehouses" topic -- a single index on the combined FK columns in the fact table(s) -- may not be as efficient in terms of performance in most schemas as the index strategy suggested by this article (an index on each FK column). It is suggested that the method described in this article be used instead.


  6. Clean up your data to remove all NULLs for those data items that are being moved into OLAP Services.


Additional query words: oledb


Keywords          : 
Version           : winnt:6.5,7.0
Platform          : winnt 
Issue type        : kbhowto kbinfo 

Last Reviewed: June 9, 1999