Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why NonClustered index scan faster than Clustered Index scan?

As I know, heap tables are tables without clustered index and has no physical order. I have a heap table "scan" with 120k rows and I am using this select:

SELECT id FROM scan

If I create a non-clustered index for the column "id", I get 223 physical reads. If I remove the non-clustered index and alter the table to make "id" my primary key (and so my clustered index), I get 515 physical reads.

If the clustered index table is something like this picture:

enter image description here

Why Clustered Index Scans workw like the table scan? (or worse in case of retrieving all rows). Why it is not using the "clustered index table" that has less blocks and already has the ID that I need?

like image 843
Mucida Avatar asked Oct 18 '25 11:10

Mucida


2 Answers

SQL Server indices are b-trees. A non-clustered index just contains the indexed columns, with the leaf nodes of the b-tree being pointers to the approprate data page. A clustered index is different: its leaf nodes are the data page itself and the clustered index's b-tree becomes the backing store for the table itself; the heap ceases to exist for the table.

Your non-clustered index contains a single, presumably integer column. It's a small, compact index to start with. Your query select id from scan has a covering index: the query can be satisfied just by examining the index, which is what is happening. If, however, your query included columns not in the index, assuming the optimizer elected to use the non-clustered index, an additional lookup would be required to fetch the data pages required, either from the clustering index or from the heap.

To understand what's going on, you need to examine the execution plan selected by the optimizer:

  • See Displaying Graphical Execution Plans
  • See Red Gate's SQL Server Execution Plans, by Grant Fritchey
like image 149
Nicholas Carey Avatar answered Oct 21 '25 03:10

Nicholas Carey


A clustered index generally is about as big as the same data in a heap would be (assuming the same page fullness). It should use just a little more reads than a heap would use because of additional B-tree levels.

A CI cannot be smaller than a heap would be. I don't see why you would think that. Most of the size of a partition (be it a heap or a tree) is in the data.

Note, that less physical reads does not necessarily translate to a query being faster. Random IO can be 100x slower than sequential IO.

like image 27
usr Avatar answered Oct 21 '25 01:10

usr