SQL Server Best Practices Article
Published: June 7, 2007
Writers: Denny Lee, Nicholas Dristas, Carl Rabeler
Contributors: Kevin Cox, Craig Utley, Eric Jacobsen, Akshai Mirchandani
Technical Reviewer: Sasha (Alexander) Berger, Dave Wickert
Applies To: SQL Server 2005 SP2
Summary: Following the best practices described in this article will help you improve the performance of OLAP database processing.
On This Page
Introduction
Dimension Processing Best Practices
Cube Processing Best Practices
Parallel Processing Best Practices
Troubleshooting Best Practices
Summary
Introduction
The recommendations in this white paper provide best practices for processing in Microsoft® SQL Server™ 2005 Analysis Services. These performance optimizations are based on real-world implementations with some of our top customers and partners who use SQL Server 2005 Analysis Services Enterprise Edition.

This document presents Analysis Services processing best practices recommendations in the following sections:
- Dimension Processing Best Practices
- Cube Processing Best Practices
- Parallel Processing Best Practices
- Troubleshooting Processing Best Practices
Dimension Processing Best Practices
While dimension processing is quite fast in Analysis Services 2005, here are some best practices to help optimize dimension processing. For more information on the inner workings of Analysis Services dimension processing, see the "Processing Engine" section in Analysis Services 2005 Processing Architecture by T.K. Anand.
Add indexes to the underlying table
In most cases, adding indexes to the underlying table improves dimension processing because the SQL statement that Analysis Services generates is in the form of SELECT DISTINCT AttribA, AttribB FROM DimensionTable. Building good attribute/hierarchy relationships allows the Analysis Services engine to build multiple smaller DISTINCT queries instead of one large and very expensive SELECT DISTINCT AttribA, AttribB, ..., AttribN query. Additional indexes therefore improve the performance of these DISTINCT queries.
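As an illustrative sketch only (the table and column names are hypothetical, not from the Adventure Works sample), a nonclustered index that covers the attribute columns referenced by one of these generated DISTINCT queries can help the relational engine avoid a full table scan:

-- Hypothetical dimension table; adjust the names to your own schema.
-- The index covers the attribute columns that Analysis Services
-- SELECTs DISTINCT during dimension processing.
CREATE NONCLUSTERED INDEX IX_DimCustomer_Geography
ON dbo.DimCustomer (CityKey, StateProvinceKey, CountryKey);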
Create a separate table for dimension processing
If you experience slow dimension processing, creating a new table that is used only to process the dimension can improve processing speed. For example, in the case of a snowflake dimension where multiple tables are joined together to define your dimension, consider creating a separate new table specifically for processing the dimension. This is especially important if you have dimensions that are designed on top of disparate databases; this way you can avoid OPENROWSET processing queries.
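For example, a snowflake Product dimension spread across three tables could be flattened into a single processing-only table during ETL, before dimension processing runs. This is a minimal sketch; the table and column names are hypothetical:

-- Rebuild the processing-only table during ETL, before dimension processing.
IF OBJECT_ID('dbo.DimProduct_Processing') IS NOT NULL
    DROP TABLE dbo.DimProduct_Processing;

SELECT p.ProductKey,
       p.ProductName,
       s.SubcategoryKey,
       s.SubcategoryName,
       c.CategoryKey,
       c.CategoryName
INTO dbo.DimProduct_Processing
FROM dbo.DimProduct AS p
JOIN dbo.DimProductSubcategory AS s ON p.SubcategoryKey = s.SubcategoryKey
JOIN dbo.DimProductCategory AS c ON s.CategoryKey = c.CategoryKey;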
Use SQL views to implement query binding for dimensions
While query binding for dimensions does not exist within Analysis Services 2005, you can implement it by using a view (instead of tables) for your underlying dimension data source. That way, you can use hints or other SQL tuning techniques to optimize the SQL statement that accesses the dimension tables through your view.
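A minimal sketch of such a view follows. The object names and the NOLOCK hint are illustrative assumptions; apply whatever hints or tuning constructs make sense for your workload:

-- Dimension data source view; point the Analysis Services dimension at this
-- view instead of at the base tables so the SQL can be tuned in one place.
CREATE VIEW dbo.vDimCustomer
AS
SELECT c.CustomerKey,
       c.CustomerName,
       g.City,
       g.StateProvinceName
FROM dbo.DimCustomer AS c WITH (NOLOCK)
JOIN dbo.DimGeography AS g WITH (NOLOCK)
    ON c.GeographyKey = g.GeographyKey;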
Be careful when you use the ProcessUpdate enumeration
If you use ProcessUpdate and dimension members are being changed or deleted, flexible aggregations will be dropped. These must be recreated by processing the indexes on the affected partitions and objects or by enabling lazy aggregations. Otherwise, there will not be any aggregations for users to query against; this results in slower query performance. Also, if rigid relationships have been defined and a change is detected, an error will be returned, requiring you to perform a full process.
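As a hedged sketch of that follow-up step (the dimension ID is hypothetical and the namespaces are abbreviated, as elsewhere in this paper), you could run ProcessIndexes against the affected cube after the dimension ProcessUpdate to rebuild the dropped flexible aggregations:

<Batch xmlns="...">
  <!-- Commands in a Batch that are not wrapped in <Parallel> run sequentially. -->
  <Process>
    <Object>
      <DatabaseID>Adventure Works DW</DatabaseID>
      <DimensionID>Dim Customer</DimensionID>
    </Object>
    <Type>ProcessUpdate</Type>
  </Process>
  <!-- Rebuild indexes and aggregations on the cube whose flexible aggregations were dropped. -->
  <Process>
    <Object>
      <DatabaseID>Adventure Works DW</DatabaseID>
      <CubeID>Adventure Works DW</CubeID>
    </Object>
    <Type>ProcessIndexes</Type>
  </Process>
</Batch>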
Use the ProcessAdd enumeration if you are only adding new dimension members
The ProcessAdd process enumeration is not available in the processing UI, but it is available within XMLA. Introduced in Analysis Services 2005, it is essentially an optimized version of ProcessUpdate for cases where only new members are being added. To use ProcessAdd correctly to improve performance, restrict the dimension table to only the newly added rows. For more information, including sample code, see the second set of code in Analysis Services 2005 Processing Architecture on the Microsoft Developer Network (MSDN). This processing type makes dimension processing much faster because it never deletes or updates existing dimension members, nor does it drop flexible aggregations.
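A minimal sketch of the XMLA shape follows. The dimension ID is hypothetical, the namespaces are abbreviated, and the mechanism that restricts the relational source to only the newly added rows is omitted; the article referenced above shows the complete pattern:

<Batch xmlns="...">
  <Parallel>
    <Process>
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <DimensionID>Dim Customer</DimensionID>
      </Object>
      <Type>ProcessAdd</Type>
      <!-- In practice, supply out-of-line bindings (or point the dimension at a view)
           so that only the newly added rows are read from the relational source. -->
    </Process>
  </Parallel>
</Batch>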
Upgrade to Analysis Services SP2 because it has dimension process caching
Analysis Services 2005 Service Pack 2 (SP2) provides processing improvements through dimension process caching. To take advantage of this feature on larger systems, add memory so that Analysis Services can use it. (It is not uncommon for Analysis Services to use 10 GB (gigabytes) of memory if it is available.) This prevents it from paging to disk.

Performance improvements when using dimension process caching are typically seen when processing on servers with two or more processors. For example, the following table shows processing times for a 4-GB RAM, four-processor server that is processing 32 partitions in parallel. Both the Analysis Services and SQL Server databases are on the same server.
| Service Pack | Process time (hh:mm:ss.sss) |
|---|---|
| SP1 | 00:13:28.020 |
| SP2 | 00:12:29.966 |
Cube Processing Best Practices
To ensure fast and effective cube processing, it is important to remember that cube processing is performed in Analysis Services by executing Analysis Services-generated SQL statements against the underlying relational database. Because of this, it is possible (when necessary) to tune the underlying SQL statements and/or update statistics. Adding indexes to the underlying tables may improve performance as well, but test this first because table scanning is often a faster approach for cube processing.
Use integer keys if at all possible
When working with any version of Analysis Services, ensure that the key columns are integers. This is important for both SQL Server and non-SQL Server relational database systems because of the SQL statement generated by Analysis Services when it is processing a cube, measure group, or partition. The resulting query plan and execution of this SQL statement will be more efficient if the joins are between integer keys.

Performance improvements are even more apparent with Analysis Services SP2 due to its dimension process caching feature. These improvements also apply to composite integer keys of sizes 1, 2, 4, and 8 bytes. String keys see some improvement as well, but not as much as integer keys.
Following is the SQL query that is generated by Analysis Services when processing the [Internet Sales] measure group in the [Adventure Works DW] OLAP database.
SELECT
    [dbo_FactInternetSales].[CustomerKey] AS [dbo_FactInternetSalesCustomerKey0_0],
    [dbo_FactInternetSales].[PromotionKey] AS [dbo_FactInternetSalesPromotionKey0_1],
    [dbo_FactInternetSales].[SalesTerritoryKey] AS [dbo_FactInternetSalesSalesTerritoryKey0_2],
    [dbo_FactInternetSales].[OrderDateKey] AS [dbo_FactInternetSalesOrderDateKey0_3],
    [dbo_FactInternetSales].[ShipDateKey] AS [dbo_FactInternetSalesShipDateKey0_4],
    [dbo_FactInternetSales].[DueDateKey] AS [dbo_FactInternetSalesDueDateKey0_5],
    [dbo_FactInternetSales].[ProductKey] AS [dbo_FactInternetSalesProductKey0_6],
    [dbo_FactInternetSales].[CurrencyKey] AS [dbo_FactInternetSalesCurrencyKey0_7]
FROM
(
    SELECT
        [dbo].[FactInternetSales].[ProductKey],
        [dbo].[FactInternetSales].[OrderDateKey],
        [dbo].[FactInternetSales].[DueDateKey],
        [dbo].[FactInternetSales].[ShipDateKey],
        [dbo].[FactInternetSales].[CustomerKey],
        [dbo].[FactInternetSales].[PromotionKey],
        [dbo].[FactInternetSales].[CurrencyKey],
        [dbo].[FactInternetSales].[SalesTerritoryKey],
        [dbo].[FactInternetSales].[SalesOrderNumber],
        [dbo].[FactInternetSales].[SalesOrderLineNumber],
        [dbo].[FactInternetSales].[RevisionNumber],
        [dbo].[FactInternetSales].[OrderQuantity],
        [dbo].[FactInternetSales].[UnitPrice],
        [dbo].[FactInternetSales].[ExtendedAmount],
        [dbo].[FactInternetSales].[UnitPriceDiscountPct],
        [dbo].[FactInternetSales].[DiscountAmount],
        [dbo].[FactInternetSales].[ProductStandardCost],
        [dbo].[FactInternetSales].[TotalProductCost],
        [dbo].[FactInternetSales].[SalesAmount],
        [dbo].[FactInternetSales].[TaxAmt],
        [dbo].[FactInternetSales].[Freight],
        [dbo].[FactInternetSales].[CarrierTrackingNumber],
        [dbo].[FactInternetSales].[CustomerPONumber]
    FROM [dbo].[FactInternetSales]
    WHERE OrderDateKey >= '915' AND OrderDateKey <= '1280'
) AS [dbo_FactInternetSales]
ORDER BY [dbo_FactInternetSales].[CustomerKey] ASC

The SQL statement, and particularly the query plan, that are generated by Analysis Services are different when the underlying fact table for the [Internet Sales] measure group uses varchar keys, as shown in Figure 1.
Figure 1 Estimated query plan between varchar-key vs. integer-key fact tables
As you can see from the estimated query plan in Figure 1, the varchar-key fact table (Query 2) has a higher cost associated with the table scan than does the integer-key fact table (Query 1). Specifically, there are higher subtree costs associated with both the Sort and Table Scan tasks for the varchar-key fact table.
| Task | Query 1 (integer keys): Total Subtree Cost | Query 2 (varchar keys): Total Subtree Cost |
|---|---|---|
| Sort | 3.19582 | 3.375974 |
| Table Scan | 0.9134235 | 1.247498 |
Use query binding to optimize processing
You can improve cube processing performance by tuning the underlying SQL statements executed. For this reason, it is important to use query binding instead of table binding when you build your measure group partitions (see the SQL statements in Figures 2 and 3). Looking at the Adventure Works OLAP cube in Microsoft Visual Studio®, notice the Partitions tab as shown in Figure 2.
Figure 2 Adventure Works Partitions tab
Clicking the partition source opens the Partition Source dialog box as shown in Figure 3. As you can see, when Query Binding is selected as the binding type, the data source is a developer-created SQL statement.
Figure 3 Partition Source dialog box
By using this binding type, you can optimize the SQL query that is sent to the relational data source.
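For instance, a hedged sketch of a partition's query binding follows; the key range is borrowed from the generated query shown earlier and is purely illustrative, and any hints you add would depend on your own workload:

-- Query binding for one partition of the Internet Sales measure group.
-- Restricting on the partition key keeps each partition's processing query small,
-- and the statement can be tuned (hints, statistics, indexes) independently.
SELECT *
FROM [dbo].[FactInternetSales]
WHERE OrderDateKey >= 915 AND OrderDateKey <= 1280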
Partition measure groups if you have a lot of data
It is generally a good idea to map the measure group partitions either to separate SQL table partitions (by using partition keys) or to separate SQL tables. Use partitioning wherever possible; this limits the amount of reprocessing that needs to be done based on your extraction, transformation, and loading (ETL). For more information, see "Partition Design Best Practices" in OLAP Design Best Practices for Analysis Services 2005 on Microsoft TechNet.
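As a minimal relational-side sketch (the names, boundary values, and single-filegroup placement are all assumptions), SQL Server 2005 table partitioning on the date key lets each Analysis Services partition line up with one relational partition:

-- Partition the fact table by OrderDateKey so each Analysis Services partition
-- maps to one relational partition (boundary values are illustrative).
CREATE PARTITION FUNCTION pfOrderDate (int)
AS RANGE RIGHT FOR VALUES (20020101, 20030101, 20040101);

CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate ALL TO ([PRIMARY]);

CREATE TABLE dbo.FactSales_Partitioned
(
    OrderDateKey int NOT NULL,
    ProductKey   int NOT NULL,
    SalesAmount  money NOT NULL
) ON psOrderDate (OrderDateKey);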
Use ProcessData and ProcessIndexes instead of ProcessFull
The ProcessFull enumeration type executes both the ProcessData and ProcessIndexes jobs. For more information on these process enumeration types, see the "Dimension-Processing Commands" and "Partition-Processing Commands" sections in the SQL Server 2005 Analysis Services Performance Guide.
Breakdown of processing enumerations
While testing, the Microsoft SQL Server Development Customer Advisory Team (SQLCAT) noticed that processing against the relational database completes faster and Analysis Services consumes less memory when the ProcessData and ProcessIndexes enumerations are used as compared to using ProcessFull. Memory considerations are especially important on 32-bit systems because of their limited addressable memory. As you can see from the following table, there are noticeable improvements in processing time.
| Process enumeration | Processing time |
|---|---|
| Two-step process (total) | 00:12:23 |
| ProcessData (step 1) | 00:10:18 |
| ProcessIndexes (step 2) | 00:02:05 |
| ProcessFull | 00:13:34 |
<Batch xmlns="...">
  <Parallel>
    <Process xmlns:xsd="..." xmlns:xsi="..." xmlns:ddl2="..." xmlns:ddl2_2="...">
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
      </Object>
      <Type>ProcessData</Type> <!-- or ProcessIndexes -->
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

For more information on the available processing enumerations, see SQL Server Books Online, particularly the Type Element topic.
CPU usage patterns are different between the enumerations
The apparent reason for this difference is that the CPU usage patterns of the two approaches are quite different. See the performance monitor output in the next three figures (the % Processor Time counter is highlighted in each figure). Not only is processing faster when using ProcessData and ProcessIndexes, but average CPU utilization also differs (ProcessFull: 64.021, ProcessData: 47.847, ProcessIndexes: 54.353).
Figure 4 CPU usage pattern: ProcessFull, average 64.021
Figure 5 CPU usage pattern: ProcessData, average 47.847
Figure 6 CPU usage pattern: ProcessIndexes, average 54.353
As you can see from Figures 5 and 6, the ProcessData and ProcessIndexes enumerations behave differently from ProcessFull. For some customers, this difference resulted in less processing time against their relational data source. To investigate the scalability of your system, process using ProcessData with one partition and MaxParallel=1, then with two partitions and MaxParallel=2, and so forth. Measure and compare the impact (memory utilized, CPU usage, disk I/O, and so on) to determine optimal performance.
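As a sketch of one step in that scalability test (the measure group and partition IDs are illustrative and the namespaces are abbreviated), the first run with a single partition and MaxParallel=1 might look like the following; subsequent runs would add partitions and raise MaxParallel in step:

<Batch xmlns="...">
  <Parallel MaxParallel="1">
    <Process>
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales_2001</PartitionID>
      </Object>
      <Type>ProcessData</Type>
    </Process>
  </Parallel>
</Batch>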
Parallel Processing Best Practices
Here are a number of techniques for optimizing parallel processing.
Do not always use the UI default for parallel processing
By default, when Analysis Services processes the selected cube, measure group, or partitions, the server decides how to process the object(s) in parallel. However, letting the server decide how many threads to process in parallel may not always be optimal. For example, for the [Adventure Works DW] sample OLAP database on a 4-GB RAM, four-processor server running Analysis Services SP2, the comparable processing times in seconds are:
| Parallel processing setting | Process time (ss.sss) |
|---|---|
| Letting the server decide | 50.990 |
| Manually setting parallel to four tasks | 47.347 |
Figure 7 Analysis Services Change Settings dialog box
Set the appropriate values for parallel processing
Generally, the appropriate number of parallel tasks is 1.5 to 2 times the number of CPUs; exceeding this value may cause competition for resources. Always test the system to see how well parallelism works for you. Because resources are limited, a plausible option may also be to use less parallelism rather than more.

The ascmd.exe command-line utility is a handy tool for testing performance characteristics to determine the appropriate values for parallel processing. This utility is similar to the sqlcmd.exe utility that you use to execute SQL statements against SQL Server. The ascmd.exe utility executes XMLA commands against an Analysis Services server.
The ascmd.exe utility is included in SQL Server Samples, available at SQL Server 2005 Samples and Sample Databases on the Microsoft Download Center. After you install the samples, you can find ascmd.exe in the c:\Program Files\Microsoft SQL Server\90\Samples\Analysis Services\Administrator\ascmd folder. For more information on the ascmd command-line utility, see the Readme For Ascmd Command-line Utility Sample on MSDN.
Following is an example XMLA script to process the [Adventure Works DW] cube; XMLA name spaces have been removed for clarity.
<Batch xmlns="...">
  <Parallel MaxParallel="8">
    <Process xmlns:xsd="..." xmlns:xsi="..." xmlns:ddl2="..." xmlns:ddl2_2="...">
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

By using ascmd.exe and XMLA, you can adjust the XMLA MaxParallel parameter to try out different values. (The XMLA example sets this to a value of 8; the previous tip, Do not always use the UI default for parallel processing, shows how to set this in the UI.) This setting specifies the maximum number of threads on which to run commands in parallel, as in the Change Settings dialog box (accessed through the processing UI) in Figure 7.
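For example, a run of one of the test scripts from the command line might look like the following; the server, file names, and exact switches are assumptions here, and the authoritative switch list is in the ascmd readme referenced above:

rem Execute the ProcessFull script with MaxParallel="8" and capture the server response.
ascmd.exe -S localhost -d "Adventure Works DW" -i ProcessFull_MaxParallel8.xmla -o ProcessFull_MaxParallel8_out.xml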
Using the ascmd.exe utility and four different XMLA scripts, we were able to quickly automate and test four different parallel processing settings. Following are the results of this test. As you can see, for the particular machine configuration used in this test, it was beneficial to set the parallelism to 8 instead of letting the server decide.
| MaxParallel setting | SQL/OLAP on same server | SQL/OLAP on different servers |
|---|---|---|
| Let the server decide (default) | 00:12:29.966 | 00:07:43.060 |
| Parallel = 4 | 00:17:05.000 | 00:08:07.953 |
| Parallel = 6 | 00:12:57.023 | 00:07:37.023 |
| Parallel = 8 | 00:11:28.983 | 00:07:31.933 |
Use the XMLA <Parallel> nodes to group processing tasks
In the processing tasks defined within your XMLA, create logical <Parallel></Parallel> sets to group processing tasks together so that the server uses the maximum resources. At the same time, get into the habit of processing the individual partitions (versus the entire measure group) using the <PartitionID></PartitionID> node to process less data at any one time.

Within the <Batch></Batch> element, if you set the Transaction attribute to true, all commands within the batch will be considered as a single transaction. If this attribute is set to false, every single processing command will be committed upon successful completion. This latter option is beneficial if you are processing different measure groups and/or objects in parallel. As you can see in the following example, the XMLA processes the [Fact Sales Summary] measure group and its individual partitions in parallel.
<Batch xmlns="..." Transaction="true">
  <Parallel>
    <Process xmlns:xsd="..." xmlns:xsi="..." ...="">
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Sales Summary</MeasureGroupID>
        <PartitionID>Total_Sales_2001</PartitionID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
    <Process xmlns:xsd="..." xmlns:xsi="..." ...="">
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Sales Summary</MeasureGroupID>
        <PartitionID>Total_Sales_2002</PartitionID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
    <Process xmlns:xsd="..." xmlns:xsi="..." ...="">
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Sales Summary</MeasureGroupID>
        <PartitionID>Total_Sales_2003</PartitionID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
    <Process xmlns:xsd="..." xmlns:xsi="..." ...="">
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
        <MeasureGroupID>Fact Sales Summary</MeasureGroupID>
        <PartitionID>Total_Sales_2004</PartitionID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

For the sake of clarity, the XML namespaces in this example were removed. Note that the Transaction attribute has been set to true, meaning that the processing of the four [Fact Sales Summary] partitions will be committed as a single transaction. On a single-processor desktop computer, you may notice that processing this measure group takes about 11 seconds when processed as a single measure group and about 6 seconds when processed as four individual partitions in parallel.
Troubleshooting Best Practices
These tips for troubleshooting processing will help you to understand your processing bottlenecks.
Profile your SQL statements to tune them
To help with tuning, you can use SQL Server Profiler to capture the SQL statements issued by Analysis Services so that you can perform your regular SQL tuning efforts (such as updating statistics, adding indexes, and so on). For example, processing the [Sales Summary] measure group in the [Adventure Works DW] OLAP database produces the SQL Server Profiler trace shown in Figure 8.
Figure 8 SQL Server Profiler trace for AdventureWorks processing
By selecting one of the many SQL statements issued, you can then generate its estimated query plan. In the case of the SQL statement selected in Figure 8, the total subtree cost was 2.285195.
Figure 9 Total subtree cost of 2.285195
In this particular case, the Sales Summary measure group is partitioned by Order Date; the underlying SQL has a WHERE clause on the OrderDateKey column. Adding a clustered index on the OrderDateKey column of the FactResellerSales table improves the performance of cube processing. The new query plan for the same SQL statement now has a total subtree cost of 1.142899 (Figure 10).
Figure 10 Total subtree cost 1.142899 with clustered indexes
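A minimal sketch of that index follows; it assumes FactResellerSales does not already have a clustered index (if it does, you would need to drop or rebuild the existing one first):

-- Cluster the fact table on the partitioning key used in the processing WHERE clause.
CREATE CLUSTERED INDEX IX_FactResellerSales_OrderDateKey
ON dbo.FactResellerSales (OrderDateKey);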
Create a system-wide trace file
If you are having problems determining what is causing your processing bottlenecks, keep a system-wide processing trace file. Use it to determine which processing queries are running slower (or faster) than expected, as well as whether they were executed at all. You can obtain this trace file by using SQL Server Profiler to record the information as described in the previous best practice, Profile your SQL statements to tune them.

Another handy way to do this is to run the following XMLA script (which processes the Adventure Works cube); it outputs the trace to the c:\OLAP_Processing.trc file.
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
    <ObjectDefinition>
      <Trace>
        <ID>Analysis Services Processing Trace</ID>
        <Name>Analysis Services Processing Trace</Name>
        <LogFileName>C:\OLAP_Processing.trc</LogFileName>
        <LogFileAppend>1</LogFileAppend>
        <AutoRestart>1</AutoRestart>
        <LogFileSize>50000</LogFileSize>
        <LogFileRollover>1</LogFileRollover>
        <Events>
          <Event>
            <EventID>5</EventID>
            <Columns>
              <ColumnID>7</ColumnID> <ColumnID>15</ColumnID> <ColumnID>39</ColumnID> <ColumnID>8</ColumnID>
              <ColumnID>12</ColumnID> <ColumnID>28</ColumnID> <ColumnID>32</ColumnID> <ColumnID>40</ColumnID>
              <ColumnID>1</ColumnID> <ColumnID>13</ColumnID> <ColumnID>25</ColumnID> <ColumnID>33</ColumnID>
              <ColumnID>41</ColumnID> <ColumnID>2</ColumnID> <ColumnID>14</ColumnID> <ColumnID>42</ColumnID>
              <ColumnID>3</ColumnID> <ColumnID>11</ColumnID> <ColumnID>43</ColumnID>
            </Columns>
          </Event>
          <Event>
            <EventID>6</EventID>
            <Columns>
              <ColumnID>7</ColumnID> <ColumnID>15</ColumnID> <ColumnID>23</ColumnID> <ColumnID>39</ColumnID>
              <ColumnID>8</ColumnID> <ColumnID>24</ColumnID> <ColumnID>32</ColumnID> <ColumnID>40</ColumnID>
              <ColumnID>1</ColumnID> <ColumnID>9</ColumnID> <ColumnID>25</ColumnID> <ColumnID>33</ColumnID>
              <ColumnID>41</ColumnID> <ColumnID>2</ColumnID> <ColumnID>6</ColumnID> <ColumnID>10</ColumnID>
              <ColumnID>14</ColumnID> <ColumnID>22</ColumnID> <ColumnID>42</ColumnID> <ColumnID>3</ColumnID>
              <ColumnID>11</ColumnID> <ColumnID>43</ColumnID> <ColumnID>4</ColumnID> <ColumnID>12</ColumnID>
              <ColumnID>28</ColumnID> <ColumnID>5</ColumnID> <ColumnID>13</ColumnID>
            </Columns>
          </Event>
          <Event>
            <EventID>8</EventID>
            <Columns>
              <ColumnID>7</ColumnID> <ColumnID>15</ColumnID> <ColumnID>39</ColumnID> <ColumnID>8</ColumnID>
              <ColumnID>24</ColumnID> <ColumnID>1</ColumnID> <ColumnID>5</ColumnID> <ColumnID>9</ColumnID>
              <ColumnID>13</ColumnID> <ColumnID>25</ColumnID> <ColumnID>41</ColumnID> <ColumnID>2</ColumnID>
              <ColumnID>10</ColumnID> <ColumnID>14</ColumnID> <ColumnID>22</ColumnID> <ColumnID>42</ColumnID>
              <ColumnID>3</ColumnID> <ColumnID>11</ColumnID> <ColumnID>43</ColumnID> <ColumnID>4</ColumnID>
              <ColumnID>12</ColumnID> <ColumnID>28</ColumnID>
            </Columns>
          </Event>
        </Events>
      </Trace>
    </ObjectDefinition>
  </Create>
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ddl2="http://schemas.microsoft.com/analysisservices/2003/engine/2" xmlns:ddl2_2="http://schemas.microsoft.com/analysisservices/2003/engine/2/2">
      <Object>
        <DatabaseID>Adventure Works DW</DatabaseID>
        <CubeID>Adventure Works DW</CubeID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

<Delete xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <Object>
    <TraceID>Analysis Services Processing Trace</TraceID>
  </Object>
</Delete>
Upgrade to Analysis Services SP2 for NUMA support
A big improvement to processing with SP2 is Analysis Services' native support for non-uniform memory access (NUMA) hardware. For more information on NUMA, see Understanding Non-uniform Memory Access in SQL Server Books Online.
Prevent long-running queries from interfering with processing
Long-running queries that are executed concurrently with processing can prevent processing from completing. In order to commit the processing activity, Analysis Services takes an exclusive lock on the database. A long-running query will block this lock request, and processing must wait for the query to finish before it can complete. If you cannot create an exclusive processing window, consider a separate querying/processing Analysis Services server architecture as shown in Figure 11. The Analysis Services processing server exclusively processes data; upon completion, it synchronizes with the Analysis Services querying server, which solely serves MDX queries.
Figure 11 Querying / Processing AS server architecture
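A minimal sketch of the synchronization step in that architecture follows (the server name is a placeholder, and options such as security and compression settings are omitted); it would be executed against the querying server after processing completes on the processing server:

<!-- Pull the freshly processed database from the processing server to the querying server. -->
<Synchronize xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Source>
    <ConnectionString>Provider=MSOLAP;Data Source=ProcessingServer;Integrated Security=SSPI</ConnectionString>
    <Object>
      <DatabaseID>Adventure Works DW</DatabaseID>
    </Object>
  </Source>
</Synchronize>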