Sunday, March 27, 2011

A Few SQL Server Tips & Tricks

Cursor Optimization Tips

    *****

  • Try to avoid using SQL Server cursors, whenever possible.
    SQL Server cursors can result in some performance degradation in comparison with select statements. Try to use a correlated subquery, derived tables, or a temp table if you need to perform row-by-row operations.
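As a minimal sketch of replacing a cursor with a set-based statement (the Orders and Customers tables and their columns are hypothetical):

```sql
-- Instead of a cursor that fetches each customer and updates
-- their orders one row at a time, a single set-based UPDATE
-- lets SQL Server process all qualifying rows at once:
UPDATE o
SET    o.Total = o.Total * (1 - c.DiscountRate)
FROM   Orders o
JOIN   Customers c ON c.CustomerID = o.CustomerID
WHERE  c.DiscountRate > 0
```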


  • *****

  • Do not forget to close a SQL Server cursor when its result set is no longer needed.
    To close a SQL Server cursor, use the CLOSE {cursor_name} command. This command releases the cursor result set and frees any cursor locks held on the rows on which the cursor is positioned.


  • *****

  • Do not forget to deallocate a SQL Server cursor when the data structures comprising the cursor are no longer needed.
    To deallocate a SQL Server cursor, use the DEALLOCATE {cursor_name} command. This command removes the cursor reference and releases the data structures comprising the cursor.
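When a cursor is unavoidable, the full lifecycle looks like this (the Customers table and its columns are hypothetical):

```sql
DECLARE @Name varchar(50)

DECLARE cur_customers CURSOR FOR
    SELECT Name FROM Customers WHERE Active = 1

OPEN cur_customers
FETCH NEXT FROM cur_customers INTO @Name

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @Name
    FETCH NEXT FROM cur_customers INTO @Name
END

CLOSE cur_customers        -- releases the result set and any cursor locks
DEALLOCATE cur_customers   -- removes the cursor's data structures
```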


  • *****

  • Try to reduce the number of records to process in the cursor.
    To reduce the cursor result set, use the WHERE clause in the cursor's select statement. It can increase cursor performance and reduce SQL Server overhead.


  • *****

  • Try to reduce the number of columns to process in the cursor.
    Include only the necessary columns in the cursor's select statement. This reduces the cursor result set, so the cursor uses fewer resources. It can increase cursor performance and reduce SQL Server overhead.


  • *****

  • Use READ ONLY cursors, whenever possible, instead of updatable cursors.
    Because using cursors can reduce concurrency and lead to unnecessary locking, try to use READ ONLY cursors if you do not need to update the cursor result set.


  • *****

Index Optimization Tips

    *****

  • Consider creating index on column(s) frequently used in the WHERE, ORDER BY, and GROUP BY clauses.
    These columns are the best candidates for indexing. Analyze your queries carefully to avoid creating indexes that will not be used.


  • *****

  • Keep your indexes as narrow as possible.
    Because each index takes up disk space, try to minimize the index key's size to avoid using superfluous disk space. This reduces the number of reads required to read the index and boosts overall index performance.


  • *****

  • Drop indexes that are not used.
    Because each index takes up disk space and slows the adding, deleting, and updating of rows, you should drop indexes that are not used. You can use the Index Tuning Wizard to identify indexes that are not used by your queries.


  • *****

  • Try to create indexes on columns that have integer values rather than character values.
    Because integer values are usually smaller than character values (the int data type is 4 bytes, the bigint data type is 8 bytes), you can reduce the number of index pages used to store the index keys. This reduces the number of reads required to read the index and boosts overall index performance.


  • *****

  • Limit the number of indexes, if your application updates data very frequently.
    Because each index takes up disk space and slows the adding, deleting, and updating of rows, you should create new indexes only after analyzing how the data is used, the types and frequencies of queries performed, and how your queries will use the new indexes. In many cases, the speed advantages of creating new indexes outweigh the disadvantages of additional space used and slower row modification. However, avoid redundant indexes; create them only when necessary. For a read-only table, the number of indexes can be increased.


  • *****

  • Check that index you tried to create does not already exist.
    Keep in mind that when you create a primary key or unique constraint, SQL Server automatically creates an index on the column(s) participating in the constraint. If you specify another index name, you can end up creating indexes on the same column(s) again and again.


  • *****

  • Create a clustered index instead of a nonclustered one to increase the performance of queries that return a range of values and of queries that contain GROUP BY or ORDER BY clauses and return sorted results.
    Because every table can have only one clustered index, you should choose the column(s) for this index very carefully. Analyze all your queries, choose the most frequently used ones, and include in the clustered index only those column(s) that provide the most performance benefit from clustered index creation.


  • *****

  • Create nonclustered indexes to increase performance of the queries that return few rows and where the index has good selectivity.
    In comparison with a clustered index, of which each table can have only one, each table can have as many as 249 nonclustered indexes. However, you should consider nonclustered index creation as carefully as clustered index creation, because each index takes up disk space and slows data modification.


  • *****

  • Create the clustered index on column(s) that are not updated very frequently.
    Because the leaf nodes of nonclustered indexes contain the clustered index key when the table has a clustered index, every time a column used for the clustered index is modified, all of the nonclustered indexes must also be modified.


  • *****

  • Create the clustered index based on a single column that is as narrow as possible.
    Because nonclustered indexes contain the clustered index key within their leaf nodes and use the clustered index to locate data rows, creating the clustered index on a single, narrow column reduces not only the size of the clustered index, but the size of all nonclustered indexes on the table as well.


  • *****

  • Avoid creating a clustered index based on an incrementing key.
    For example, if a table has a surrogate integer primary key declared as IDENTITY and the clustered index was created on this column, then every time data is inserted into this table, the rows are added to the end of the table. When many rows are added, a "hot spot" can occur. A "hot spot" occurs when many queries try to read or write data in the same area at the same time, and it results in an I/O bottleneck.
    Note: by default, SQL Server creates a clustered index for the primary key constraint. So, in this case, you should explicitly specify the NONCLUSTERED keyword to indicate that a nonclustered index should be created for the primary key constraint.
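A sketch of the pattern described in the note (the table and index names are hypothetical):

```sql
-- Make the IDENTITY primary key nonclustered, so the clustered
-- index stays free for a column that queries actually range-scan:
CREATE TABLE Orders (
    OrderID   int IDENTITY(1, 1) NOT NULL
        CONSTRAINT PK_Orders PRIMARY KEY NONCLUSTERED,
    OrderDate datetime NOT NULL,
    Total     money NOT NULL
)

CREATE CLUSTERED INDEX IX_Orders_OrderDate ON Orders (OrderDate)
```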


  • *****

  • Create a clustered index for each table.
    If you create a table without a clustered index, the data rows are not stored in any particular order. This structure is called a heap. Every time data is inserted into such a table, the row is added to the end of the table. When many rows are added, a "hot spot" can occur. To avoid "hot spots" and improve concurrency, you should create a clustered index for each table.


  • *****

  • Don't create indexes on column(s) whose values have low selectivity.
    For example, don't create an index on a column with many duplicate values, such as a "Sex" column (which has only "Male" and "Female" values), because in this case the disadvantages of additional space used and slower row modification outweigh the speed advantages of creating the new index.


  • *****

  • If you create a composite (multi-column) index, order the columns in the key to enhance selectivity, with the most selective columns leftmost in the key.
    The order of the columns in a composite (multi-column) index is very important; the right order increases the chance that the index will be used.


  • *****

  • If you create a composite (multi-column) index, try to order the columns in the key so that the WHERE clauses of the frequently used queries match the column(s) that are leftmost in the index.
    The order of the columns in a composite (multi-column) index is very important. The index will be used to evaluate a query only if the leftmost index key column is specified in the WHERE clause of the query. For example, if you create a composite index on (Name, Age), then a query with the WHERE clause "WHERE Name = 'Alex'" will use the index, but a query with the WHERE clause "WHERE Age = 28" will not.
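The example from the text can be sketched as follows (the Employees table is hypothetical):

```sql
CREATE INDEX IX_Employees_Name_Age ON Employees (Name, Age)

-- Can use IX_Employees_Name_Age (the leftmost column is specified):
SELECT Name, Age FROM Employees WHERE Name = 'Alex'
SELECT Name, Age FROM Employees WHERE Name = 'Alex' AND Age = 28

-- Cannot seek on the index (the leftmost column is missing):
SELECT Name, Age FROM Employees WHERE Age = 28
```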


  • *****

  • If you need to join several tables very frequently, consider creating index on the joined columns.
    This can significantly improve performance of the queries against the joined tables.


  • *****

  • Consider creating a surrogate integer primary key (identity, for example).
    Every table must have a primary key (a unique identifier for a row within a database table). A surrogate primary key is a field that has a unique value but no actual meaning to the record itself, so users should never see or change a surrogate primary key. Some developers use surrogate primary keys; others use data fields themselves as the primary key. If a primary key consists of many data fields or is large, consider creating a surrogate integer primary key instead. This can improve the performance of your queries.


  • *****

  • Consider creating indexes on all the columns referenced in the WHERE clauses of your most frequently used queries when those clauses contain the OR operator.
    If the WHERE clause in a query contains an OR operator and any of the referenced columns are not indexed, a table or clustered index scan will be made. In this case, creating indexes on all such columns can significantly improve query performance.


  • *****

  • If your application will perform the same query over and over on the same table, consider creating a covering index for this query.
    A covering index is an index that includes all of the columns referenced in the query. Creating a covering index can improve performance because all the data for the query is contained within the index itself; only the index pages, not the data pages, are used to retrieve the data. Covering indexes can greatly speed up a query because they can save a huge number of I/O operations.
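A sketch of a covering index (the Orders table and the query are hypothetical). Since this era of SQL Server has no separate "included columns" syntax, a covering index is simply a composite index containing every column the query touches:

```sql
-- The frequently run query:
--     SELECT CustomerID, OrderDate FROM Orders WHERE CustomerID = @id
-- This index contains both referenced columns, so the query can be
-- answered from the index pages alone, without touching the data pages:
CREATE INDEX IX_Orders_Customer_Date ON Orders (CustomerID, OrderDate)
```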


  • *****

  • Use the DBCC DBREINDEX statement to rebuild all the indexes on all the tables in your database periodically (for example, once a week on Sunday) to reduce fragmentation.
    Because fragmented data can cause SQL Server to perform unnecessary data reads, and query performance against a heavily fragmented table can be very bad, you should periodically rebuild all indexes to reduce fragmentation. Try to schedule the DBCC DBREINDEX statement during CPU idle time and slow production periods.
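For example (the table name is hypothetical; an empty index name rebuilds every index on the table, and 90 is the fill factor):

```sql
DBCC DBREINDEX ('dbo.Orders', '', 90)
```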


  • *****

  • Use the DBCC INDEXDEFRAG statement to defragment clustered and secondary indexes of the specified table or view.
    The DBCC INDEXDEFRAG statement is a new SQL Server 2000 command, which was not supported in the previous versions. Unlike DBCC DBREINDEX, DBCC INDEXDEFRAG does not hold locks long term and thus will not block running queries or updates. So, try to use the DBCC INDEXDEFRAG command instead of DBCC DBREINDEX, whenever possible.
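For example (the database, table, and index names are hypothetical):

```sql
-- Defragments the given index without the long-term locks of DBCC DBREINDEX:
DBCC INDEXDEFRAG (Northwind, Orders, IX_Orders_Customer_Date)
```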


  • *****

  • Consider using the SORT_IN_TEMPDB option when you create an index and when tempdb is on a different set of disks than the user database.
    The SORT_IN_TEMPDB option is a new SQL Server 2000 feature, which was not supported in the previous versions. When you create an index with the SORT_IN_TEMPDB option, SQL Server uses the tempdb database, instead of the current database, to sort data during the index creation. Using this option can reduce the time it takes to create an index, but increases the amount of disk space used to create an index.
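For example (the table and index names are hypothetical):

```sql
-- Sort the intermediate index build results in tempdb instead of
-- the user database; this helps most when tempdb is on separate disks:
CREATE INDEX IX_Orders_Total ON Orders (Total) WITH SORT_IN_TEMPDB
```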


  • *****

  • Use the SQL Server Profiler Create Trace Wizard with "Identify Scans of Large Tables" trace to determine which tables in your database may need indexes.
    This trace will show which tables are being scanned by queries instead of using an index.


  • *****

Transact-SQL Optimization Tips

    *****

  • Try to restrict the query's result set by using the WHERE clause.
    This can result in good performance benefits, because SQL Server will return to the client only particular rows, not all the rows from the table(s). This reduces network traffic and boosts the overall performance of the query.


  • *****

  • Try to restrict the query's result set by returning only the particular columns you need from the table, not all the table's columns.
    This can result in good performance benefits, because SQL Server will return to the client only particular columns, not all the table's columns. This reduces network traffic and boosts the overall performance of the query.


  • *****

  • Use views and stored procedures instead of heavy-duty queries.
    This can reduce network traffic, because your client will send the server only a stored procedure or view name (perhaps with some parameters) instead of the text of a large, heavy-duty query. Views and stored procedures can also be used to facilitate permission management, because you can restrict user access to table columns they should not see.


  • *****

  • Try to avoid using SQL Server cursors, whenever possible.
    SQL Server cursors can result in some performance degradation in comparison with select statements. Try to use correlated subquery or derived tables, if you need to perform row-by-row operations.


  • *****

  • If you need to return a table's total row count, you can use an alternative to the SELECT COUNT(*) statement.
    Because SELECT COUNT(*) performs a full table scan to return the total row count, it can take a very long time on a large table. There is another way to determine the total row count in a table: the sysindexes system table. The rows column in sysindexes contains the total row count for each table in your database. So, you can use the following select statement instead of SELECT COUNT(*): SELECT rows FROM sysindexes WHERE id = OBJECT_ID('table_name') AND indid < 2. This can make such queries several times faster. Keep in mind that the value stored in sysindexes may not always be perfectly accurate.
    See this article for more details:
    Alternative way to get the table's row count.


  • *****

  • Try to use constraints instead of triggers, whenever possible.
    Constraints are much more efficient than triggers and can boost performance. So, you should use constraints instead of triggers, whenever possible.


  • *****

  • Use table variables instead of temporary tables.
    Table variables require fewer locking and logging resources than temporary tables, so table variables should be used whenever possible. Note that table variables were introduced in SQL Server 2000 and are not available in earlier versions.


  • *****

  • Try to avoid the HAVING clause, whenever possible.
    The HAVING clause is used to restrict the result set returned by the GROUP BY clause. When you use GROUP BY with the HAVING clause, the GROUP BY clause divides the rows into sets of grouped rows and aggregates their values, and then the HAVING clause eliminates undesired aggregated groups. In many cases, you can write your select statement so that it contains only WHERE and GROUP BY clauses, without a HAVING clause. This can improve the performance of your query.
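A sketch of moving a filter out of HAVING (the Orders table is hypothetical):

```sql
-- Slower: every row is grouped first, then whole groups are discarded.
SELECT CustomerID, SUM(Total)
FROM   Orders
GROUP BY CustomerID
HAVING CustomerID > 1000

-- Faster: rows are filtered before grouping.
SELECT CustomerID, SUM(Total)
FROM   Orders
WHERE  CustomerID > 1000
GROUP BY CustomerID

-- HAVING remains necessary only for conditions on aggregates,
-- e.g. HAVING SUM(Total) > 500.
```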


  • *****

  • Try to avoid using the DISTINCT clause, whenever possible.
    Because using the DISTINCT clause will result in some performance degradation, you should use this clause only when it is necessary.


  • *****

  • Include SET NOCOUNT ON statement into your stored procedures to stop the message indicating the number of rows affected by a T-SQL statement.
    This can reduce network traffic, because your client will not receive the message indicating the number of rows affected by a T-SQL statement.


  • *****

  • Use the select statements with TOP keyword or the SET ROWCOUNT statement, if you need to return only the first n rows.
    This can improve performance of your queries, because the smaller result set will be returned. This can also reduce the traffic between the server and the clients.


  • *****

  • Use the FAST number_rows query hint if you need to quickly return 'number_rows' rows.
    You can quickly get the first 'number_rows' rows and begin working with them while the query continues execution and produces its full result set.
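For example (the Orders table is hypothetical; the hint goes in the query's OPTION clause):

```sql
-- Ask the optimizer to favor a plan that returns the first 10 rows
-- quickly, even if the full result set takes longer overall:
SELECT OrderID, OrderDate
FROM   Orders
ORDER BY OrderDate DESC
OPTION (FAST 10)
```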


  • *****

  • Try to use UNION ALL statement instead of UNION, whenever possible.
    The UNION ALL statement is much faster than UNION, because UNION ALL does not eliminate duplicate rows, while UNION looks for duplicate rows whether or not any exist.
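For example (the table names are hypothetical). When the two inputs cannot overlap, or duplicates are acceptable, UNION ALL skips the duplicate-elimination step entirely:

```sql
SELECT CustomerID FROM CurrentOrders
UNION ALL
SELECT CustomerID FROM ArchivedOrders
```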


  • *****

  • Do not use optimizer hints in your queries.
    Because the SQL Server query optimizer is very good at its job, it is very unlikely that you can improve your query by using optimizer hints; more often than not, this will hurt performance.


  • *****

Miscellaneous Optimization Tips

    *****

  • Try to perform backups to a local hard disk first, and copy the backup file(s) to tape later.
    While a backup is running, some SQL Server commands cannot be executed: you cannot run an ALTER DATABASE statement with either the ADD FILE or REMOVE FILE options, you cannot shrink the database, you cannot run a CREATE INDEX statement, you cannot perform SELECT INTO or bulk load operations, and so on. So, to improve backup performance, you can perform the backup to a local hard disk first, and then copy the backup file(s) to tape.


  • *****

  • Use nonlogged bulk copy whenever possible.
    The nonlogged bulk copy is much faster than the logged one, but to use it you must meet all of the following conditions:
    1. The database option 'select into/bulkcopy' is set to true.
    2. The target table is not being replicated.
    3. The TABLOCK hint is specified.
    4. The target table has no indexes, or if the table has indexes, it is empty when the bulk copy starts.
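Under those conditions, the load might be invoked like this (the server, database, table, and file names are hypothetical; -n requests native mode, -T a trusted connection, and -h passes the TABLOCK hint):

```shell
bcp Northwind.dbo.Orders in C:\data\orders.dat -n -T -S MYSERVER -h "TABLOCK"
```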


  • *****

  • Use native mode bulk copy whenever possible.
    This can improve performance in comparison with the character mode.


  • *****

  • Try to use BULK INSERT command instead of bcp or DTS to load data into SQL Server.
    The BULK INSERT command is much faster than bcp or the DTS data pump for text file import operations; however, the BULK INSERT statement cannot bulk copy data from SQL Server to a data file.
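A sketch of a BULK INSERT load (the table name and file path are hypothetical):

```sql
BULK INSERT dbo.Orders
FROM 'C:\data\orders.txt'
WITH (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    TABLOCK            -- take a table lock for the fastest load
)
```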


  • *****

  • Use the bcp utility instead of DTS when you need to export data from the SQL Server table into a text file.
    The bcp utility is much faster than DTS, so try to use it whenever possible.


  • *****

  • Keep transactions as short as possible.
    This helps prevent deadlocks, because locks are held for a shorter time.


  • *****

  • Consider horizontally partitioning very large tables into current and archive versions.
    This can improve performance of your select statements.


  • *****

  • Normalize your database's tables to the third normal form.
    A table is in third normal form (3NF) if it is in second normal form (2NF) and if it does not contain transitive dependencies. In most cases, you should normalize your tables to the third normal form.


  • *****

  • Consider denormalizing your database's tables from the fourth or fifth normal form to the third normal form.
    Normalization to the fourth and fifth normal forms can result in some performance degradation, so it can be necessary to denormalize your database's tables to prevent it.


  • *****

  • Make the table's columns as narrow as possible.
    This reduces the table's size, which can improve the performance of your queries and of some maintenance tasks (such as backup and restore).


  • *****

  • Use varchar/nvarchar columns instead of text/ntext columns whenever possible.
    Because SQL Server stores text/ntext columns on Text/Image pages, separately from the other data stored on Data pages, it can take more time to retrieve text/ntext values.


  • *****

  • Use char/varchar columns instead of nchar/nvarchar if you do not need to store unicode data.
    This reduces the table's size, which can improve the performance of your queries and of some maintenance tasks (such as backup and restore).


  • *****

  • Use char/nchar data type instead of varchar/nvarchar if the column's size will be 4 bytes or less.
    The char data type is a fixed-length data type and varchar is a variable-length data type. So, by using the char data type, you increase the probability of an in-place update instead of a delete/insert or deferred update. The in-place update is the most efficient modification method: the data is changed in its physical location. When the row's size changes, the delete/insert modification method may be used instead, which results in some performance degradation.


  • *****

  • If you need to delete all of a table's rows, consider using TRUNCATE TABLE instead of the DELETE command.
    TRUNCATE TABLE is a much faster way to delete all the rows in a table, because it removes them without logging the individual row deletes.


  • *****

  • You can increase the speed of sort operations by up to 20 percent by using the Binary sort order instead of the default Dictionary sort order.
    Binary sort order is the simplest and fastest sort order, but it is not used often, because it is case-sensitive and is based on the numeric values (from 0 through 255) of the characters in the installed character set.


  • *****

  • Don't use Enterprise Manager to access remote servers over a slow link or to maintain very large databases.
    Because Enterprise Manager is very resource-intensive, use stored procedures and T-SQL statements in these cases.


  • *****

  • Try to install SQL Server on a standalone server box, not on a primary or backup domain controller.
    Because domain controllers have extra overhead, you should install SQL Server on a standalone server box to dedicate all of the server's power to SQL Server.


  • *****

Stored Procedures Optimization Tips

    *****

  • Use stored procedures instead of heavy-duty queries.
    This can reduce network traffic, because your client will send the server only the stored procedure name (perhaps with some parameters) instead of the text of a large, heavy-duty query. Stored procedures can also be used to enhance security and conceal underlying data objects. For example, you can give users permission to execute a stored procedure that works with a restricted set of columns and data.


  • *****

  • Include the SET NOCOUNT ON statement into your stored procedures to stop the message indicating the number of rows affected by a Transact-SQL statement.
    This can reduce network traffic, because your client will not receive the message indicating the number of rows affected by a Transact-SQL statement.
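A minimal sketch of the pattern (the procedure and table names are hypothetical):

```sql
CREATE PROCEDURE dbo.usp_GetActiveCustomers
AS
    SET NOCOUNT ON   -- suppress the "(n row(s) affected)" messages

    SELECT CustomerID, Name
    FROM   Customers
    WHERE  Active = 1
GO
```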


  • *****

  • Call stored procedure using its fully qualified name.
    The complete name of an object consists of four identifiers: the server name, database name, owner name, and object name. An object name that specifies all four parts is known as a fully qualified name. Using fully qualified names eliminates any confusion about which stored procedure you want to run and can boost performance because SQL Server has a better chance to reuse the stored procedures execution plans if they were executed using fully qualified names.


  • *****

  • Consider returning an integer value with a RETURN statement instead of as part of a recordset.
    The RETURN statement exits unconditionally from a stored procedure, so the statements following RETURN are not executed. Though the RETURN statement is generally used for error checking, you can use it to return an integer value for any other reason. Using the RETURN statement can boost performance because SQL Server will not create a recordset.
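For example (the procedure and table names are hypothetical):

```sql
-- Return a status code instead of a one-row recordset:
CREATE PROCEDURE dbo.usp_CustomerExists
    @CustomerID int
AS
    IF EXISTS (SELECT * FROM Customers WHERE CustomerID = @CustomerID)
        RETURN 1
    RETURN 0
GO

-- The caller captures the return value directly:
DECLARE @found int
EXEC @found = dbo.usp_CustomerExists @CustomerID = 42
```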


  • *****

  • Don't use the prefix "sp_" in the stored procedure name if you need to create a stored procedure to run in a database other than the master database.
    The prefix "sp_" is used in the names of system stored procedures. Microsoft does not recommend using the "sp_" prefix in user-created stored procedure names, because SQL Server always looks for a stored procedure beginning with "sp_" in the following order: first in the master database, then based on any qualifiers provided (database name or owner), and finally using dbo as the owner if no owner is specified. So, when you have a stored procedure with the "sp_" prefix in a database other than master, the master database is always checked first, and if a user-created stored procedure has the same name as a system stored procedure, the user-created stored procedure will never be executed.


  • *****

  • Use the sp_executesql stored procedure instead of the EXECUTE statement.
    The sp_executesql stored procedure supports parameters, so using it instead of the EXECUTE statement improves the readability of your code when many parameters are used. When you use sp_executesql to execute a Transact-SQL statement that will be reused many times, the SQL Server query optimizer will reuse the execution plan it generated for the first execution whenever the only variation is a change in parameter values.
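For example (the table and column names are hypothetical; note that both the statement and the parameter definition must be Unicode strings):

```sql
-- The plan for this statement can be reused for any @id value:
EXEC sp_executesql
    N'SELECT Name FROM Customers WHERE CustomerID = @id',
    N'@id int',
    @id = 42
```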


  • *****

  • Use sp_executesql stored procedure instead of temporary stored procedures.
    Microsoft recommends using temporary stored procedures only when connecting to earlier versions of SQL Server that do not support the reuse of execution plans. Applications connecting to SQL Server 7.0 or SQL Server 2000 should use the sp_executesql system stored procedure instead of temporary stored procedures to improve the chance that execution plans are reused.


  • *****

  • If you have a very large stored procedure, try to break down this stored procedure into several sub-procedures, and call them from a controlling stored procedure.
    A stored procedure is recompiled when structural changes are made to a table or view it references (for example, by an ALTER TABLE statement), or when a large number of INSERTs, UPDATEs, or DELETEs are made to a table it references. So, if you break down a very large stored procedure into several sub-procedures, there is a chance that only a single sub-procedure will be recompiled while the others are not.


  • *****

  • Try to avoid using temporary tables inside your stored procedure.
    Using temporary tables inside a stored procedure reduces the chance of reusing the execution plan.


  • *****

  • Try to avoid using DDL (Data Definition Language) statements inside your stored procedure.
    Using DDL statements inside a stored procedure reduces the chance of reusing the execution plan.


  • *****

  • Add the WITH RECOMPILE option to the CREATE PROCEDURE statement if you know that your query will vary each time it is run from the stored procedure.
    The WITH RECOMPILE option prevents reuse of the stored procedure's execution plan: SQL Server does not cache a plan for the procedure, and the procedure is recompiled at run time. Using the WITH RECOMPILE option can boost performance if your query varies each time it is run from the stored procedure, because in that case a wrong cached execution plan will not be used.
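For example (the procedure and table names are hypothetical):

```sql
-- Recompile on every execution, for a procedure whose optimal plan
-- varies widely with its parameter values:
CREATE PROCEDURE dbo.usp_SearchOrders
    @FromDate datetime,
    @ToDate   datetime
WITH RECOMPILE
AS
    SELECT OrderID, OrderDate, Total
    FROM   Orders
    WHERE  OrderDate BETWEEN @FromDate AND @ToDate
GO
```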


  • *****

  • Use SQL Server Profiler to determine which stored procedures are being recompiled too often.
    To check whether a stored procedure is being recompiled, run SQL Server Profiler and trace the "SP:Recompile" event in the "Stored Procedures" category. You can also trace the "SP:StmtStarting" event to see at what point in the procedure it is being recompiled. Once you identify these stored procedures, you can take corrective actions to reduce or eliminate the excessive recompilations.
