SQL Server and BI/BA With A Smile: Execution Plans – Table, Clustered Index and Non-Clustered Index Scan

Monday, April 2, 2012

Execution Plans – Table, Clustered Index and Non-Clustered Index Scan

An important operator to understand in execution plans is a scan. A Scan can be good and bad, so understanding the difference between them is a basic skill needed by a DBA or developer that is going to do performance tuning.

The Table Scan means that a table does not have a clustered index and the optimal query plan decided to scan all leaf level pages in the table. The table is stored as a Heap because there is not clustered index. A table can have only one clustered index.

We are going to drop the primary key/clustered index on the DatabaseLog table of the AdventureWorks database.

IF  EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[DatabaseLog]') AND name = N'PK_DatabaseLog')ALTER TABLE [dbo].[DatabaseLog] DROP CONSTRAINT [PK_DatabaseLog]GO

Then, we are going to execute the following code with Include Actual Execution Plan query option activated. In SQL Server Management Studio open a query window, either right-click the query window and choose Include Actual Execution Plan or select Include Actual Execution Plan from the Query menu.

Execute this code:

SELECT [DatabaseLogID]      ,[PostTime]      ,[DatabaseUser]      ,[Event]      ,[Schema]      ,[Object]      ,[TSQL]      ,[XmlEvent]  FROM [dbo].[DatabaseLog]

You will get the following execution plan table scan

The optimizer says a full Table Scan is the best plan. The operator Table Scan indicates that the table does not have a clustered index. By default, creating a primary key will make the key a clustered index, but that is not always the case. I am working on a Fact table at work where a Unique Constraint is the clustered index, but the primary key is a 7 column compound key of Surrogate Keys.

By adding a Clustered Index to table DatabaseLog (Primary Key), we can change this Table scan to a clustered index scan. A clustered index changes the logical/physical structure of the table to follow an index tree.

ALTER TABLE [dbo].[DatabaseLog]     ADD  CONSTRAINT [PK_DatabaseLog]         PRIMARY KEY CLUSTERED         ([DatabaseLogID] ASC) 

We get the following if we execute the SELECT query again like above:

So, what is the difference in Cost? None.

So what is the big deal, they both costs the same. Well, when you start using a WHERE clause and the no clustered index table still does a full scan, but the clustered index might do an Clustered Index Seek if the density/cardinality gives the optimizer a better choice, you will see the performance difference. More on this is a later blog.

My advice, ALWAYS have a primary AND/OR a clustered index on all tables. The decision on making the primary key a clustered index is dependent on your experience in managing and planning the correct implementation.

All right, lastly we will see a Index Scan. First, we will add a Non-clustered index to the table DatabaseLog.

CREATE NONCLUSTERED INDEX [idxDatabaseLog_DatabaseUserScehmaObject]     ON [dbo].[DatabaseLog](    [DatabaseUser], [Schema], [Object] )

Now, the query optimizer creates a plan to scan the new non-clustered index to retrieve the data needed in the SELECT statement. If we add a WHERE clause to the SELECT, it will change to a Index Seek.

SELECT [DatabaseUser], [Schema], [Object]  FROM [dbo].[DatabaseLog]  WHERE DatabaseUser = 'dbo'

Reading execution plans can be a lot easier than understanding them to help performance tuning. I know I spent many hours thinking I finally got it, before some article on SQL Server Central website or SQL Server Magazine changed the understanding I have about T-SQL statements and indexing for performance. But, you have to start somewhere, then keep improving your understanding.

5 comments:

UnknownJanuary 3, 2013 at 9:57 PM
thanks. made my concepts a little bit clear.
ReplyDelete
Replies
Thomas LeBlancJanuary 4, 2013 at 6:13 AM
not problem, love to share what has been taught to me from others, as well as experience...

Thomas
ReplyDelete
Replies
gorakshnathFebruary 11, 2013 at 5:43 AM
who decide which index should be used , i have one database on two different environment. Its same database dump - indexes , columns are same , then one execution plan has with non clustered index & other with clustered index.
ReplyDelete
Replies
gorakshnathFebruary 11, 2013 at 5:44 AM
Its for same query ...
ReplyDelete
Replies
Thomas LeBlancFebruary 19, 2013 at 6:20 AM
You might want to Reindex and/or Update statistics on both databases then look at execution plans again.

Thanks,
Thomas
ReplyDelete
Replies

About Thomas LeBlanc

Thomas LeBlanc (Microsoft MVP in Data Technology Platform) is a Data Warehouse Architect in Baton Rouge, and uses his 28+ years in IT to help develop OLTP systems with normalized databases for high-performing T-SQL and end-to-end dimensional data marts using SSIS, SSAS, Power BI, and Excel. His SQL Server certifications include MCSA 12, MCITP 08 BI and DBA, MCITP 2005 DBA, and MCDBA 2000. As a PASS volunteer, he is past chair of Data Arch VC & Excel BI VC  and involved with Baton Rouge SQL Server User Group and volunteers for SQLSaturday.